Remote Data Scientist jobs – Senior Machine Learning Engineer (Python, TensorFlow, AWS) – Full‑Time – $120K‑$150K – Raymore, Missouri Remote

🌍 Remote, USA 🚀 Full-time 🕐 Posted Recently

Job Description

We’re a ten‑year‑old SaaS company that started in a cramped garage in Raymore, Missouri, and has since grown into a 200‑person organization serving more than 15,000 small‑business customers across North America. Our product – a real‑time inventory‑visibility platform – lives in the cloud, and the decisions our customers make every day depend on the predictions we generate. That’s why we’re looking for a senior‑level Remote Data Scientist who can take ownership of the end‑to‑end machine‑learning pipeline, from raw data ingestion to production‑grade model monitoring. The role is remote, but the team still meets once a week on a video call that we all jokingly call “the coffee‑break stand‑up.”

### Why this role exists now

In the last twelve months we added two new data sources: a POS stream from a major grocery chain and a fleet of IoT sensors on delivery trucks. Those streams increased our daily data volume by 68 % and opened a new line of business we’re calling “Predictive Re‑stock.” To turn those streams into actionable insights we need a data scientist who can design, validate, and ship models that run on both AWS and GCP. Our current team of six data engineers and two junior scientists has built a solid feature store, but we lack a senior person who can set technical standards, mentor the junior members, and embed robust governance into the model lifecycle. We’ve also committed to a new Service Level Agreement (SLA) with a marquee client – 95 % model‑drift detection within 24 hours – and we need your expertise to meet that commitment.

### What you’ll spend your day doing

| Time | Activity |
|------|----------|
| 20 % | **Data exploration & cleansing** – write Jupyter notebooks in Python and R to profile the new POS and sensor data, flag anomalies, and document findings in Confluence. |
| 20 % | **Feature engineering** – design time‑series features using pandas, Dask, and Spark, store them in our Snowflake data warehouse, and push them to the feature store managed by Feast. |
| 20 % | **Model development** – prototype with scikit‑learn, XGBoost, and TensorFlow; run hyper‑parameter sweeps on Vertex AI (GCP) or SageMaker (AWS). |
| 15 % | **Productionization** – containerize models with Docker, orchestrate pipelines in Airflow, and deploy to Kubernetes clusters that auto‑scale based on traffic. |
| 15 % | **Monitoring & governance** – set up Prometheus alerts, Grafana dashboards, and drift detection using Evidently AI; write post‑mortems that feed back into the data catalog. |
| 10 % | **Mentorship & collaboration** – pair‑program with junior scientists, review pull requests on GitHub, and run fortnightly brown‑bag sessions on emerging ML research. |

*Note:* All work is done remotely, but we rely on a strong culture of async communication. You’ll use Slack for quick questions, Notion for project roadmaps, and our internal wiki for knowledge sharing.

### The metrics that matter

- **Model accuracy:** lift > 12 % over baseline for Predictive Re‑stock forecasts.
- **Latency:** 95 % of inference calls return in under 150 ms (to be met after the first month).
- **SLA compliance:** 98 % of drift alerts triggered within the 24‑hour window (see the drift‑check sketch after this list).
- **Code quality:** < 5 % of PRs require re‑work after review (tracked via GitHub Checks).
- **Team growth:** mentor at least two junior scientists to become independent contributors within six months.
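To make the drift SLA concrete, here is a minimal sketch of the kind of daily drift check this role would own. The S3 paths and feature layout are hypothetical, and the `Report`/`DataDriftPreset` interface shown is from Evidently’s 0.4‑era releases, so verify it against the version you pin.

```python
# Minimal drift-check sketch. The S3 paths below are hypothetical;
# the Report/DataDriftPreset API matches Evidently's 0.4-era releases.
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

def dataset_has_drift(reference: pd.DataFrame, current: pd.DataFrame) -> bool:
    """Return True if Evidently flags dataset-level drift."""
    report = Report(metrics=[DataDriftPreset()])
    report.run(reference_data=reference, current_data=current)
    # DataDriftPreset includes DatasetDriftMetric, whose result
    # carries a boolean dataset_drift flag.
    for metric in report.as_dict()["metrics"]:
        if metric["metric"] == "DatasetDriftMetric":
            return bool(metric["result"]["dataset_drift"])
    return False

if __name__ == "__main__":
    # Hypothetical daily job: compare today's features to the training snapshot.
    reference = pd.read_parquet("s3://features/restock_training_snapshot.parquet")
    current = pd.read_parquet("s3://features/restock_daily.parquet")
    if dataset_has_drift(reference, current):
        # In production this would push to Alertmanager or Slack rather than print.
        print("Drift detected: page the on-call scientist")
```

Run on a schedule well inside the 24‑hour window, a check like this is what backs the 98 % alert‑compliance metric above.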
### The tech stack (8‑12 tools we love)

1. **Python 3.11** – our primary language for modelling, data wrangling, and API glue.
2. **R** – used by the analytics team for exploratory statistics on A/B tests.
3. **SQL (Snowflake + PostgreSQL)** – for ad‑hoc queries and data‑warehouse maintenance.
4. **Apache Spark** – distributed processing of the sensor streams.
5. **TensorFlow & PyTorch** – deep‑learning frameworks for demand‑forecast models.
6. **scikit‑learn & XGBoost** – classic ML algorithms for classification tasks.
7. **AWS SageMaker & GCP Vertex AI** – managed training and deployment services.
8. **Docker & Kubernetes (EKS & GKE)** – containerization and orchestration of production workloads.
9. **Airflow** – DAG‑based pipeline orchestration for ETL and model‑training jobs (sketched after this section).
10. **Feast (Feature Store)** – central repository for feature versioning and serving.
11. **Prometheus + Grafana** – monitoring stack for model latency and drift.
12. **Evidently AI** – automated reporting of data‑drift, model‑performance, and fairness metrics.

We also keep an eye on **MLflow** for experiment tracking, **DVC** for data versioning, and **Looker** for dashboarding, but the twelve tools above are the daily workhorses.

### Who you are

- **Experience:** 5+ years building production‑grade ML models, preferably in a SaaS or e‑commerce environment. You have shipped at least three end‑to‑end pipelines that survived a full production lifecycle.
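To show how several of these tools fit together, here is a minimal sketch of the nightly pipeline DAG referenced in the stack list. The DAG id, schedule, and task bodies are hypothetical stand‑ins; only the `DAG`/`PythonOperator` wiring reflects Airflow’s actual API, and the `schedule` argument assumes Airflow 2.4+.

```python
# Sketch of a nightly retraining DAG. Task callables are hypothetical
# placeholders; the DAG/PythonOperator wiring is standard Airflow 2.x.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def build_features():
    # e.g. a pandas/Spark job that writes features to Snowflake and Feast
    ...

def train_model():
    # e.g. an XGBoost/TensorFlow run launched on SageMaker or Vertex AI
    ...

def run_drift_report():
    # e.g. the Evidently drift check sketched earlier in this posting
    ...

with DAG(
    dag_id="restock_nightly_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",  # nightly, well inside the 24-hour drift SLA
    catchup=False,
) as dag:
    features = PythonOperator(task_id="build_features", python_callable=build_features)
    training = PythonOperator(task_id="train_model", python_callable=train_model)
    drift = PythonOperator(task_id="drift_report", python_callable=run_drift_report)

    # Linear dependency chain: features feed training, training feeds monitoring.
    features >> training >> drift
```

In practice each task would be a containerized job on EKS/GKE rather than an in‑process callable, but the dependency chain is the shape of the work.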


Ready to Apply?

Don't miss out on this amazing opportunity!

🚀 Apply Now
