arXiv LLMs Assistant
Applied exllamav2 6.0bpw quantization to run Mixtral-8x7B on personal GPUs with virtually no loss in quality and evaluate RAG pipeline configurations to find the best performing one (evaluation score ~90%).
Built an assistant to study the LLM domain, compare out-of-repository papers with papers in a personal repo (Zotero) and recommend recent papers to read along with lists of question/answer pairs.
Orchestrated a multi-faceted evaluation of RAG setups, implementing feedback mechanisms that led to the selection of the most effective configuration; this process improved question relevance, elevating overall project outcomes.
Built a command line utility to ask questions about the contents of uploaded papers and generate question/answer sets based on these papers.