tylertitsworth
tylertitsworth / aiconfigurator
Optimizes LLM serving configurations for NVIDIA GPUs, enhancing deployment efficiency and performance through advanced parallelism and quantization strategies.
tylertitsworth / argocd
Facilitates GitOps continuous delivery for Kubernetes, managing application definitions and sync policies with Argo CD.
tylertitsworth / cilium
Provides eBPF-based networking, security, and observability for Kubernetes with Cilium, supporting network policy authoring and troubleshooting.
tylertitsworth / deepspeed
Enables efficient training and inference of large models with advanced memory optimization and HuggingFace integration.
tylertitsworth / fsdp
Enables efficient training of large PyTorch models using Fully Sharded Data Parallel (FSDP) for optimal GPU memory management.
tylertitsworth / gpu-operator
Automates NVIDIA GPU management on Kubernetes, enabling efficient resource allocation and monitoring for GPU infrastructure.
tylertitsworth / helm
Facilitates advanced Helm chart authoring for Kubernetes, enabling complex deployments with library charts and schema validation.
tylertitsworth / kuberay
Facilitates deploying Ray on Kubernetes with features like GPU scheduling, autoscaling, and observability for distributed workloads.
tylertitsworth / leaderworkerset
Facilitates multi-node ML inference and training on Kubernetes, ensuring cohesive pod management and efficient resource utilization.
tylertitsworth / longhorn
Manages distributed block storage on Kubernetes with features like snapshots, S3 backup, and Prometheus monitoring.
tylertitsworth / model-formats
Explains ML model serialization formats and their security implications, aiding in format selection and conversion for various frameworks.
tylertitsworth / nccl
Optimizes multi-GPU communication with NVIDIA NCCL for efficient data transfer and performance tuning in distributed systems.
tylertitsworth / nvidia-dynamo
Orchestrates distributed LLM inference with NVIDIA Dynamo, optimizing multi-GPU deployments through intelligent routing and autoscaling.
tylertitsworth / nvidia-nim
Optimizes and configures NVIDIA NIM inference microservices for efficient deployment and hardware-aware model selection.
tylertitsworth / ray-core
Facilitates distributed computing in Python with Ray Core tasks and actors for high-performance parallel workloads.
tylertitsworth / ray-train
Facilitates distributed training with Ray, enabling fault tolerance and integration with HuggingFace for efficient model training.
tylertitsworth / torch-compile
Optimizes PyTorch models using torch.compile and TorchInductor for enhanced training and inference performance.
tylertitsworth / triton-inference-server
Facilitates model serving with NVIDIA Triton Inference Server, enabling efficient management and deployment of AI models.
tylertitsworth / aws-efa
Configures AWS EFA for optimized distributed GPU training on EKS, improving performance with RDMA and the SRD protocol.
tylertitsworth / aws-fsx
Tunes AWS FSx for Lustre for high-performance ML storage on EKS, improving data throughput and efficiency.