Getting Started

Small Language Model (SLM) for Pharmaceutical Information

A Master's thesis in Computer Science at the Federal University of Paraná (UFPR) on developing a specialized language model from drug package inserts and medical databases.

🎯 Overview

DrugSLM is a research project that aims to create a domain-specific language model trained on pharmaceutical data from regulatory agencies.

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • uv package manager
  • make
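
Before installing, it can help to confirm that the tools listed above are on the PATH. The following is a minimal sketch and not part of the repository: the `missing_tools` helper is a hypothetical name, and the script assumes Docker Compose is installed as the `docker compose` plugin, which matches the commands used in this guide.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print each command name from the argument list
# that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools named in the prerequisites list.
missing=$(missing_tools docker uv make)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing
else
  # Assumption: Compose is the docker CLI plugin, so it is checked separately.
  docker compose version >/dev/null 2>&1 || echo "Docker Compose plugin not available"
fi
```

If the script prints nothing, all three tools and the Compose plugin were found.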

Installation

# Clone the repository
git clone https://github.com/vinnylg/drugslm.git
cd drugslm

# Start services
docker compose up -d dev       # Base container
docker compose up -d mkdocs    # Documentation server (localhost:8000)

# Open a shell inside the running dev container
docker exec -it drugslm-dev bash

📁 Project Structure

Project structure documentation coming soon.

🛠️ Development Workflow

Development workflow documentation coming soon.

📝 Citation

If you use this work in your research, please cite:

@mastersthesis{goncalves2026drugslm,
  author  = {Gonçalves, Vinícius de Lima},
  title   = {DrugSLM: A Small Language Model for Pharmaceutical Information},
  school  = {Federal University of Paraná},
  year    = {2026},
  type    = {Master's Thesis},
  address = {Curitiba, Brazil}
}

👤 Author

Vinícius de Lima Gonçalves
Master's Student in Computer Science
Federal University of Paraná (UFPR)
Curitiba, Brazil