AI Skills for Analytics Engineer
Discover 155+ AI skills for analytics engineers
Install any skill with /learn
/learn @owner/skill-namewshobson / data-quality-frameworks
Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.
wshobson / dbt-transformation-patterns
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
wshobson / spark-optimization
Optimize Apache Spark jobs with partitioning, caching, shuffle optimization, and memory tuning. Use when improving Spark performance, debugging slow jobs, or scaling data processing pipelines.
openclaw / senior-data-scientist
World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering,...
openclaw / spark-engineer
Use when building Apache Spark applications, distributed data processing pipelines, or optimizing big data workloads. Invoke for DataFrame API, Spark SQL, RDD operations, performance tuning, streaming analytics.
openclaw / jq
Command-line JSON processor. Extract, filter, transform JSON.
openclaw / elastic
Search and analyze data via Elasticsearch API. Index, search, and manage clusters.
openclaw / grafana
Manage Grafana dashboards, data sources, and alerts via API. Visualize metrics and logs.
openclaw / splunk
Search and analyze machine data via Splunk API. Run searches and manage dashboards.
majiayu000 / aeon
This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algo...
majiayu000 / aeon
This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algo...
majiayu000 / aggregating-gauge-metrics
Aggregate pre-computed metrics (gauge, counter, delta types) using OPAL. Use when analyzing request counts, error rates, resource utilization, or any numeric metrics over time. Covers align + m() + aggregate pattern, summary vs time-series output, and common aggregation functions. For percentile ...
majiayu000 / ai-ml-data-science
End-to-end data science and ML engineering workflows: problem framing, data/EDA, feature engineering (feature stores), modelling, evaluation/reporting, plus SQL transformations with SQLMesh. Use for dataset exploration, feature design, model selection, metrics and slice analysis, model cards/eval...
majiayu000 / data-engineer
Build scalable data pipelines, modern data warehouses, and
majiayu000 / apache-spark-data-processing
Complete guide for Apache Spark data processing including RDDs, DataFrames, Spark SQL, streaming, MLlib, and production deployment
majiayu000 / architecting-data
Strategic guidance for designing modern data platforms, covering storage paradigms (data lake, warehouse, lakehouse), modeling approaches (dimensional, normalized, data vault, wide tables), data mesh principles, and medallion architecture patterns. Use when architecting data platforms, choosing b...
majiayu000 / big-data
Apache Spark, Hadoop, distributed computing, and large-scale data processing for petabyte-scale workloads
majiayu000 / bigquery-expert
BigQuery Expert Engineer Skill - Comprehensive guide for GoogleSQL queries, data management, performance optimization, and cost management
majiayu000 / chroma
Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to production clusters. Use for semantic search, RAG applications, or document retrieval. Best for local develop...
majiayu000 / cloud-platforms
AWS, GCP, Azure data platforms, infrastructure as code, and cloud-native data solutions