Job Description:
The AI LLM Programmer role focuses on applying the latest advances in generative AI. You will implement and optimize Large Language Models (LLMs) to handle complex reasoning, creative generation, and context-aware interactions.
You will work with state-of-the-art open-source and proprietary models, building the architecture (such as Retrieval-Augmented Generation) that makes them useful in real-world scenarios. Your goal is to push the boundaries of what text-based AI can achieve.
This is a fast-paced role in which you must keep up with week-to-week changes in the LLM landscape. Working remotely, you will build the “intelligence” layer of our applications, ensuring high accuracy and low hallucination rates.
Responsibilities:
Fine-tune LLMs (e.g., Llama, Mistral) using parameter-efficient techniques such as LoRA and QLoRA.
Develop and optimize RAG pipelines and vector database integrations.
Implement advanced prompt engineering and chain-of-thought workflows.
Optimize model latency and cost-per-token for production.
Preferred Qualifications:
Experience with LangChain, LlamaIndex, or Haystack.
Deep understanding of attention mechanisms and the Transformer architecture.
Proficiency in Python and experience with vector databases (e.g., Pinecone, Weaviate).
Familiarity with evaluation frameworks for LLMs.

