Log IQ

AI-powered log analytics platform built for modern cloud infrastructure.

  • 10K+ events / min
  • 99.9% uptime
  • <1s delivery

Problem

Developers often spend hours digging through production logs to find potential threats and anomalies, jumping between the logs of multiple pods to trace the flow of data and errors.

Solution

Built a log analysis platform that consumes logs directly from the cloud provider's logging tools in real time. It uses AI and machine learning algorithms to analyse and debug issues, forecast load, detect anomalies, and assess system health. This is an internal project built during a hackathon at Oracle.

Architecture

The system follows a modular, pod-oriented architecture that consumes data from several sources and brings it together in a single dashboard.

  • One polling job per container, fetching the logs produced by the application running in that container
  • Storage container to hold the collected logs
  • Key information extracted and produced by the system is stored in Oracle Database 23ai
  • Spring Boot APIs enabling communication between the Data Processing and Storage Layer
  • UI dashboard built with React to display the charts
  • Chatbot integration using MCP servers
  • RAG over the code base to ground the chatbot's answers
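
A minimal sketch of the one-polling-job-per-container idea from the list above, not the project's actual code. `LogRecord`, `ContainerPoller`, `fake_source`, and the container names are hypothetical. A real poller would call the cloud provider's logging API and write to the storage container rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class LogRecord:
    container: str
    timestamp: float  # seconds since epoch
    message: str


@dataclass
class ContainerPoller:
    """Polls one container and forwards only records newer than its cursor."""
    container: str
    fetch: Callable[[str, float], Iterable[LogRecord]]  # hypothetical source API
    sink: List[LogRecord]                               # stand-in for the storage layer
    cursor: float = 0.0                                 # timestamp of the last stored record

    def poll_once(self) -> int:
        # Keep only records strictly newer than the cursor, oldest first.
        fresh = sorted(
            (r for r in self.fetch(self.container, self.cursor)
             if r.timestamp > self.cursor),
            key=lambda r: r.timestamp,
        )
        self.sink.extend(fresh)
        if fresh:
            self.cursor = fresh[-1].timestamp
        return len(fresh)


# Hypothetical in-memory log source used only for this demonstration.
FAKE_LOGS = {
    "checkout": [
        LogRecord("checkout", 1.0, "order received"),
        LogRecord("checkout", 2.0, "payment timeout"),
    ],
    "inventory": [LogRecord("inventory", 1.5, "stock reserved")],
}


def fake_source(container: str, since: float) -> List[LogRecord]:
    return [r for r in FAKE_LOGS.get(container, []) if r.timestamp > since]


store: List[LogRecord] = []
pollers = [ContainerPoller(name, fake_source, store) for name in FAKE_LOGS]

first_pass = [p.poll_once() for p in pollers]   # each poller picks up its backlog
second_pass = [p.poll_once() for p in pollers]  # nothing new, so nothing is stored
```

Keeping a cursor per container lets a job skip lines it has already stored. In this sketch the cursor lives in memory, so persisting it across restarts is left out.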

Technical Highlights

  • Seamlessly ingests structured OCI logs for instant processing and analysis
  • Detects unusual patterns using Isolation Forest and other models
  • Predicts future trends in traffic, latency, and error rates using Prophet
  • Chatbot lets users ask questions about logs in plain English
  • Visualizes key metrics like traffic, errors, and latency in real time
  • AI-assisted diagnosis helps identify what caused failures or slowdowns
  • Provides intelligent suggestions to fix common issues faster
  • Reconstructs the flow of events leading to errors or performance drops
  • Tests system resilience by simulating failures and observing log behavior
  • Links logs across services to trace issues end-to-end
  • Provides a unified view across infrastructure, apps, and services
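
To illustrate the Isolation Forest idea mentioned above, here is a from-scratch simplification in plain Python. It is a sketch under stated assumptions, not the project's model: the feature choice (error rate and latency per time window), the function names, and the sample data are all hypothetical, and a production system would more likely rely on a library such as scikit-learn's `IsolationForest`.

```python
import math
import random


def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of the harmonic number
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop if the chosen feature has no spread
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point lands, adjusted for unsplit leaf size."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Return one score per point; values closer to 1 are more anomalous."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for p in points:
        mean_depth = sum(path_length(p, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Hypothetical per-minute windows: (error rate, p95 latency in ms).
windows = [(0.01 + 0.001 * (i % 5), 120.0 + i % 7) for i in range(60)]
windows.append((0.35, 900.0))  # injected spike, made up for the demo

scores = anomaly_scores(windows)
worst = max(range(len(windows)), key=scores.__getitem__)  # index of the most anomalous window
```

Points that random splits isolate quickly get short average paths and therefore high scores, which is why the injected spike stands out without any labelled training data.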

Tech Stack

Spring Boot, Python, ML, Isolation Forest, Prophet, Time series, LLM, MCP, Llama

Trade-offs & Lessons Learned

Trade-offs

  • Many moving parts, each needing near-zero downtime to keep real-time usage working
  • Handling and storing huge volumes of data was a challenge

Lessons Learned

  • Choosing the right architecture and keeping it clean was key
  • Keeping system performance high and costs low helped a lot