LinkedIn's HLTM: A Hierarchical Long-Term Semantic Memory System for LLM-Powered Hiring Agents
Large Language Model (LLM) agents are increasingly integrated into real-world products, where personalized and context-aware user interactions are paramount. A fundamental component enabling these capabilities is the agent's long-term semantic memory system. This system is designed to extract implicit and explicit signals from noisy longitudinal behavioral data, store them in a structured format, and facilitate low-latency retrieval.

However, developing industrial-grade long-term memory for LLM agents presents five key challenges: scalability, low-latency retrieval, privacy constraints, cross-domain generalizability, and observability.

To address these issues, LinkedIn introduces the Hierarchical Long-Term Semantic Memory (HLTM) framework. HLTM organizes textual data into a schema-aligned memory tree, which captures semantic knowledge at multiple levels of granularity. This architecture enables scalable ingestion, privacy-aware storage, low-latency retrieval, and transparent provenance. Furthermore, HLTM incorporates an adaptation mechanism to generalize across diverse use cases.
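The article does not publish HLTM's internals, but the core idea of a schema-aligned memory tree with multi-granularity storage, retrieval, and provenance can be sketched as follows. All names here (`MemoryNode`, the `member/hiring` schema path, the event IDs) are illustrative assumptions, not LinkedIn's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryNode:
    """One node in a hypothetical memory tree; each level holds a
    semantic summary at its own granularity (sketch, not HLTM's code)."""
    key: str                                       # schema-aligned key
    summary: str = ""                              # summary at this level
    children: dict = field(default_factory=dict)   # finer-grained nodes
    sources: list = field(default_factory=list)    # provenance: event IDs

    def upsert(self, path, fact, source_id):
        """Ingest a fact under a schema path, recording provenance."""
        if not path:
            self.summary = fact
            self.sources.append(source_id)
            return
        head, *rest = path
        child = self.children.setdefault(head, MemoryNode(key=head))
        child.upsert(rest, fact, source_id)

    def retrieve(self, path):
        """Descend to the requested granularity; return (summary, provenance)."""
        if not path:
            return self.summary, self.sources
        child = self.children.get(path[0])
        return child.retrieve(path[1:]) if child else ("", [])

# Hypothetical usage: store and fetch one hiring-related memory.
root = MemoryNode(key="member")
root.upsert(["hiring", "role_preferences"], "prefers remote senior roles", "evt-001")
summary, provenance = root.retrieve(["hiring", "role_preferences"])
```

Keyed traversal of a shallow tree like this (rather than a flat scan over raw history) is one plausible way such a design keeps retrieval low-latency while preserving transparent provenance per node.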

Extensive evaluations on LinkedIn's Hiring Assistant demonstrate that HLTM improves both answer correctness and retrieval F1 by over 10%, while substantially advancing the Pareto frontier between query and indexing latency. HLTM is now deployed in production, powering core personalization features within the Hiring Assistant's hiring workflows.
