CV

A comprehensive overview of my academic background, research experience in LLMs and reasoning, and my professional career as a Senior Member of Technical Staff (SMTS) and ML engineer.

Basics

Name Obed Junias
Label Computer Science Graduate Student & NLP Researcher
Email [email protected]
Phone 720-266-0046
Url https://linkedin.com/in/obed-junias
Summary I am a graduate researcher at the BLAST Lab exploring natural language reasoning, LLM safety, and agentic systems. My primary research investigates the fundamental reliability of generative models, focusing on data quality and model collapse during foundation model pretraining. Following my work on interpretable commonsense reasoning (ACL 2026), my overarching vision is to pioneer advancements that ensure next-generation AI systems remain capable, transparent, logically grounded, and aligned with human values.

Work

  • 2021.08 - 2024.08
    Senior Member of Technical Staff
    Oracle Corporation
Architected and implemented TestNG and BATS automation frameworks for Oracle REST Data Services and SQLcl, reducing manual effort by 95%. Engineered a Python-based migration framework for Oracle APEX applications supporting 200+ employees.
    • Reduced manual testing effort by 95%
    • Led 15-member team for DB Tools feature validation
    • Automated Oracle APEX application migrations
  • 2021.02 - 2021.07
    Machine Learning Engineer Intern
    Hewlett Packard Enterprise
    Built intelligent automation solution with Groovy and Workfusion OCR, optimizing data workflows and integrating text extraction into software robots, increasing speed by 40% and reducing errors by 80%.
    • Increased automation speed by 40%
    • Reduced errors by 80%
    • Delivered PoC on Intelligent Business Process Management

Education

  • 2024.08 - 2026.05
    Master of Science, Computer Science
    University of Colorado Boulder
    Boulder, Colorado
    • Machine Learning
    • Data Center Scale Computing
    • NLP and Deep NLU
  • 2017.08 - 2021.08
    Bachelor of Engineering, Computer Science
    BMS College of Engineering
    Bangalore, India
    • Algorithms
    • Databases
    • Operating Systems
    • Computer Networks

Skills

Programming Languages: Python, C++, Java, Rust, SQL
AI/ML Frameworks: PyTorch, TensorFlow, Keras, Transformers, LangChain, Scikit-Learn
Cloud Technologies: GCP, Oracle Cloud (OCI), AWS, CI/CD

Languages

English: Fluent

Interests

Machine Learning & AI, NLP Research, Reasoning, LLM Safety, Responsible AI, Agentic AI, Systems

Projects

  • 2026.02 - Present
    Measuring Model Collapse under Recursive Summarization Training
    Investigating the fundamental limits of foundation model training under autophagous (self-consuming) feedback loops. Developing a robust measurement framework to detect early signals of distributional decay and knowledge fidelity loss.
    • Designing high-fidelity recursive training pipelines to isolate causal drivers of model collapse
    • Investigating the theoretical foundations of distributional shift in synthetic data dominant regimes
    • Developing entropy-based measurement frameworks to detect early signals of systemic output degradation
    • Analyzing the catalytic role of summarization tasks in accelerating semantic-level information loss
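The entropy-based collapse signal above can be illustrated with a minimal stdlib sketch (the function names, toy generations, and the choice of Shannon entropy over raw token counts are my illustrative assumptions, not the project's actual framework): as a self-consuming loop narrows the output distribution, per-generation entropy falls.

```python
from collections import Counter
import math

def token_entropy(tokens):
    """Shannon entropy (bits) of the empirical token distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def collapse_signal(generations):
    """Entropy per recursive generation; a steady decline is an early
    warning of distributional decay (model collapse)."""
    return [token_entropy(g) for g in generations]

# Toy example: each 'generation' re-samples from a shrinking vocabulary,
# mimicking an autophagous training loop.
gens = [
    "the cat sat on the mat near the red door".split(),
    "the cat sat on the mat the cat sat".split(),
    "the cat the cat the cat the cat the".split(),
]
ent = collapse_signal(gens)
```

In this toy run the entropy sequence is strictly decreasing, which is the kind of early signal a measurement framework would flag before degeneration becomes visible in outputs.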
  • 2025.06 - Present
    Commonsense Reasoning with Logical Entailment Trees
    Developed benchmarks and evaluation methods that probe the multi-fact logical reasoning capabilities of large language models on commonsense tasks.
    • Created LOGICAL-COMMONSENSEQA, a novel benchmark dataset testing compositional logic
    • Benchmarked baseline performance with N-Shot and Chain-of-Thought prompting
    • Designed neuro-symbolic methods combining informal logic with neural approaches
    • Paper accepted to ACL 2026 (Main Conference)
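The N-shot Chain-of-Thought baseline setup can be sketched as follows (the exemplar, option format, and function name are illustrative assumptions; the actual LOGICAL-COMMONSENSEQA item format may differ, and the model call itself is abstracted away):

```python
def build_cot_prompt(exemplars, question, choices):
    """Assemble an N-shot chain-of-thought prompt: each exemplar pairs a
    worked rationale with its answer, then the target question follows
    with a trailing 'Reasoning:' cue for the model to complete."""
    parts = []
    for ex in exemplars:
        parts.append(
            f"Q: {ex['question']}\nReasoning: {ex['rationale']}\nAnswer: {ex['answer']}"
        )
    opts = " ".join(f"({k}) {v}" for k, v in choices.items())
    parts.append(f"Q: {question}\nOptions: {opts}\nReasoning:")
    return "\n\n".join(parts)

# Hypothetical exemplar in the spirit of a multi-fact entailment item.
shots = [{
    "question": "Can a whale walk on land?",
    "rationale": "Whales are aquatic mammals; walking requires legs; whales lack legs.",
    "answer": "no",
}]
prompt = build_cot_prompt(shots, "Does ice float on water?", {"A": "yes", "B": "no"})
```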
  • 2024.01 - 2024.12
    ScoutR: Agentic Football Transfer Intelligence Platform
    Designed and shipped an agentic AI transfer intelligence platform that monitors player data across leagues and surfaces candidates aligned to club tactical and financial constraints. Built multi-agent pipelines with Gemini/Claude and Pydantic, and implemented RAG with ChromaDB over StatsBomb event data.
    • LangGraph-based multi-agent pipeline with Gemini and Pydantic for complex reasoning
    • RAG pipeline over StatsBomb event data using ChromaDB for grounded evaluation
    • Fault-tolerant 3-tier inference layer (Gemini -> Claude -> fallback) for reliability
    • Real-time streaming agent execution (SSE) for monitoring long-horizon tasks
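The grounded-retrieval step can be sketched with plain cosine similarity over toy vectors (the real pipeline uses ChromaDB over embedded StatsBomb event data; the 3-d vectors, player stats, and function names below are made-up stand-ins):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_vec, store, k=2):
    """Rank stored event snippets by similarity to the query embedding
    and return the top-k texts for the agent to reason over."""
    ranked = sorted(store, key=lambda d: cosine(query_vec, d["embedding"]), reverse=True)
    return [d["text"] for d in ranked[:k]]

# Toy 3-d 'embeddings' standing in for real sentence-embedding vectors.
store = [
    {"text": "Player A: 0.45 xG per 90, high pressing volume", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Player B: strong aerial duels, low progressive passes", "embedding": [0.1, 0.9, 0.1]},
    {"text": "Player C: elite progressive carries, presses aggressively", "embedding": [0.8, 0.2, 0.1]},
]
hits = retrieve([1.0, 0.0, 0.0], store, k=2)
```

A vector store like ChromaDB performs essentially this ranking at scale, with approximate-nearest-neighbour indexing replacing the brute-force sort.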
  • 2024.01 - 2024.12
    Medical Ethics Assessment of LLMs
    Systematic assessment of ethical reasoning capabilities of large language models in clinical contexts using RAG pipeline and controlled evaluation with curated multiple-choice questions.
    • RAG Pipeline Implementation
    • Ethical Reasoning Evaluation
  • 2024.01 - 2024.06
    Lost in Plot: Contrastive Learning for Movie Retrieval
    Dense retrieval system using fine-tuned BERT encoder with contrastive learning for tip-of-the-tongue movie retrieval, outperforming GPT-4 baseline on 100K+ movie corpus.
    • Contrastive Learning
    • BERT Fine-tuning
    • Dense Retrieval
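The contrastive objective behind the dense retriever can be sketched as an InfoNCE-style loss over dot-product similarities (a minimal pure-Python version; the 2-d vectors are toy stand-ins for BERT embeddings, and the exact loss used in the project may differ in details such as in-batch negatives):

```python
import math

def info_nce(query, positive, negatives, temperature=0.1):
    """Contrastive loss: pull the query toward its positive passage and
    push it away from negatives. Returns -log softmax of the positive's
    similarity, computed with a max-shift for numerical stability."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    logits = [dot(query, positive) / temperature] + [
        dot(query, n) / temperature for n in negatives
    ]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]

q = [1.0, 0.0]
good = [0.9, 0.1]                    # embedding of the true movie plot
bad = [[0.0, 1.0], [-1.0, 0.0]]      # embeddings of distractor plots
loss_aligned = info_nce(q, good, bad)
loss_misaligned = info_nce(q, bad[0], [good, bad[1]])
```

Minimizing this loss during fine-tuning is what makes the encoder map a vague "tip-of-the-tongue" query close to its true plot and far from distractors.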
  • 2024.06 - 2024.12
    Resource-Efficient LLM Fine-tuning for Mental Health Support
    Applied parameter-efficient fine-tuning with QLoRA to adapt Falcon 7B model for mental health conversations with limited computational resources.
    • QLoRA Implementation
    • Mental Health NLP
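The parameter-efficiency idea behind QLoRA can be shown with toy matrices (a pure-Python sketch of the low-rank update; in the actual setup the frozen base weights of Falcon 7B are additionally quantized to 4-bit, which this sketch omits): only the small factors A and B are trained.

```python
def matmul(X, Y):
    """Naive matrix product of nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_weight(W, A, B, alpha, r):
    """Effective weight under a LoRA adapter: W + (alpha / r) * B @ A.
    W (d_out x d_in) stays frozen; B (d_out x r) and A (r x d_in) are the
    only trainable parameters, r * (d_in + d_out) values in total."""
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, BA)]

# Toy shapes: d_out=2, d_in=3, rank r=1 -> 5 trainable vs 6 frozen params.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[0.1, 0.2, 0.3]]   # r x d_in
B = [[1.0], [2.0]]      # d_out x r
W_eff = lora_weight(W, A, B, alpha=2.0, r=1)
```

At 7B scale the same ratio is what makes single-GPU fine-tuning feasible: the rank-r factors are a tiny fraction of the frozen, quantized base weights.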
  • 2024.03 - 2024.08
    Multi-stage RAG System
    Robust RAG system with FAISS indexing and Scikit-Learn context filtering for enhanced factual grounding of LLM responses over research paper corpus.
    • FAISS Integration
    • Multi-stage Retrieval
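The multi-stage design can be sketched in pure Python (token-overlap scoring stands in for FAISS vector search, and a score threshold stands in for the Scikit-Learn context filter; the function names and toy corpus are illustrative):

```python
def overlap_score(query, doc):
    """Jaccard overlap between query and document token sets."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)

def multi_stage_retrieve(query, corpus, k=3, min_score=0.15):
    """Stage 1: recall the top-k candidates (stand-in for an ANN search
    over a FAISS index). Stage 2: filter out weakly matching contexts so
    only well-grounded passages reach the LLM prompt."""
    scored = sorted(((overlap_score(query, d), d) for d in corpus),
                    key=lambda t: t[0], reverse=True)
    return [d for s, d in scored[:k] if s >= min_score]

corpus = [
    "transformer attention scales quadratically with sequence length",
    "gradient checkpointing trades compute for memory",
    "sparse attention reduces the quadratic cost of transformers",
]
results = multi_stage_retrieve("quadratic attention cost in transformers", corpus)
```

The second stage is what distinguishes this from plain top-k retrieval: weak candidates that survive stage 1 are dropped before generation, improving factual grounding.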