Large Language Models (LLMs) have revolutionized various domains, but their high computational costs and inefficiency on repetitive tasks pose significant challenges. My work explores two optimization techniques, batching and caching, to reduce costs and improve efficiency. Batching groups multiple queries for simultaneous processing, using methods such as grouping independent queries and presenting a single example followed by batched queries. Caching reuses previously computed outputs through techniques such as full-prompt or prefix caching, key-value caching, and similarity-based caching. I evaluated the batching techniques across different workloads and context sizes and observed a consistent reduction in monetary cost, ranging from 1.5x to 4x. Future work involves integrating advanced batching, caching, and other optimization strategies into end-to-end agentic systems: autonomous LLM workflows that generate, evaluate, and act on outputs. Applying these techniques to such workflows yields significant cost savings, bridging the gap between LLM performance and real-world constraints and making these systems more accessible and efficient.
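
The two core ideas can be illustrated with a short sketch. The example below assumes a hypothetical call_llm(prompt) helper standing in for a provider API call; the batching prompt template, the string-similarity matching, and the 0.9 threshold are illustrative assumptions, not the implementation evaluated in this work.

```python
# A minimal sketch of query batching and similarity-based caching,
# assuming a hypothetical call_llm(prompt) helper for the LLM API.
from difflib import SequenceMatcher


def call_llm(prompt: str) -> str:
    """Placeholder for a single LLM API call (assumed, not a real client)."""
    raise NotImplementedError("wire this to your provider's API")


def batched_query(queries: list[str]) -> str:
    """Batching: pack several independent queries into one prompt so the
    shared instructions and context are paid for only once."""
    numbered = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(queries))
    prompt = (
        "Answer each of the numbered questions below. "
        "Reply with one numbered answer per line.\n" + numbered
    )
    return call_llm(prompt)


class SimilarityCache:
    """Similarity-based caching: return a stored answer when a new query is
    close enough to a previously seen one. String similarity stands in for
    an embedding-based lookup in this sketch."""

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold
        self.entries: list[tuple[str, str]] = []  # (query, answer) pairs

    def get_or_call(self, query: str) -> str:
        for cached_query, cached_answer in self.entries:
            if SequenceMatcher(None, query, cached_query).ratio() >= self.threshold:
                return cached_answer  # cache hit: skip the paid API call
        answer = call_llm(query)  # cache miss: pay for one call and store it
        self.entries.append((query, answer))
        return answer
```

In this sketch, batched_query amortizes the shared instructions and context across many queries, which is the source of the cost reduction reported above, while SimilarityCache avoids paying for near-duplicate queries altogether.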