Search that finds what you meant: vector databases, explained properly

TL;DR

· An embedding model turns any text into a list of numbers (a vector) in which similar meanings land close together. A vector database stores millions of these and answers "what is nearest to this?" in milliseconds.
· The speed comes from approximate indexes (HNSW): instead of comparing against every vector, the search hops through a graph of neighbourhoods and checks a tiny fraction.
· In 2026 the practical choice is short: pgvector inside Postgres for most teams (up to ~50M vectors), Qdrant when vectors are the main workload, Milvus only at hundreds of millions.

Start from the failure, because everyone has lived it. A colleague asks for "the contract with the late delivery penalty". You search the archive for "penalty". Nothing. The contract says "compensation for delayed handover". The document was always there; the search could not see that two different sentences mean the same thing. Every keyword system fails this way, in every language, and it fails harder in Arabic, where one root takes dozens of forms. Vector databases exist to fix exactly this failure, and they are the memory behind almost every serious AI system you have seen. Here is how they work, properly.

Step 1: meaning becomes numbers (embeddings)

An embedding model is a neural network with one job: read a piece of text and output a fixed-length list of numbers, typically 768 to 1,536 of them, called a vector. Training pushes the model to give similar meanings nearby numbers. "Late delivery penalty" and "compensation for delayed handover" produce vectors that are close; "annual leave policy" lands far away. Think of it as a map with a thousand dimensions instead of two: every sentence gets coordinates, and distance on the map is difference in meaning. Images, audio, and code can be embedded the same way, which is why the same machinery powers photo search and code search.
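To make "distance on the map" concrete, here is a minimal sketch of the comparison itself. The three-number vectors are hand-written stand-ins chosen for illustration, not the output of any real embedding model (real ones have hundreds of dimensions), and the function name is this sketch's own.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy vectors (an assumption for illustration, NOT model output).
toy_embeddings = {
    "late delivery penalty": [0.9, 0.8, 0.1],
    "compensation for delayed handover": [0.8, 0.9, 0.2],
    "annual leave policy": [0.1, 0.2, 0.9],
}

query = toy_embeddings["late delivery penalty"]
for text, vector in toy_embeddings.items():
    # Higher score = closer in meaning, under the toy coordinates above.
    print(f"{cosine_similarity(query, vector):.3f}  {text}")
```

With these made-up coordinates the two contract phrasings score close to 1 against each other, while the leave policy scores much lower, which is the behaviour a trained model is meant to produce on real text.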

Step 2: the nearest-neighbour problem

Now your archive is a million vectors and a question arrives: embed the question, then find the stored vectors closest to it. Closeness is usually cosine similarity, the cosine of the angle between two vectors, so a smaller angle means a higher score. The naive way compares the question against all million vectors. That works and is exact, but at tens of millions of vectors and thousands of queries per second it collapses. This, precisely, is the problem vector databases solve: nearest-neighbour search at scale, fast.
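The naive, exact search described above can be sketched in a few lines. The sizes, the random vectors, and the function names are assumptions chosen so the example runs instantly; the point is that every query touches every stored vector.

```python
import heapq
import math
import random

def normalize(v):
    """Scale a vector to length 1 so that a dot product equals cosine similarity."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def brute_force_search(query, vectors, k):
    """Score the query against EVERY stored vector; return the k best (id, score) pairs."""
    q = normalize(query)
    scored = ((sum(a * b for a, b in zip(q, v)), i) for i, v in enumerate(vectors))
    return [(i, score) for score, i in heapq.nlargest(k, scored)]

random.seed(0)
DIM, N = 16, 2000  # tiny illustrative sizes, far below a real archive
archive = [normalize([random.gauss(0, 1) for _ in range(DIM)]) for _ in range(N)]

# Use a stored vector as the query, so the best match must be itself.
top = brute_force_search(archive[42], archive, k=5)
for vector_id, score in top:
    print(vector_id, round(score, 3))
```

The work grows linearly with the archive: ten times more vectors means ten times more dot products per query, which is the cost the index in the next step avoids.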

Step 3: the trick that makes it fast (HNSW)

The dominant index is HNSW (hierarchical navigable small world). Picture the vectors as cities and the index as a road network built in layers: a top layer of long highways between distant regions, lower layers of shorter regional roads, and a bottom layer of local streets connecting true neighbours. A search starts on the highways, gets to roughly the right region in a few hops, then descends through shorter roads until it is walking local streets around the answer. Instead of checking a million vectors it checks a few thousand, and finds the true nearest neighbours 95-99% of the time. That deliberate trade, a sliver of exactness for a thousandfold speedup, is the entire engineering soul of a vector database. Two dials control the trade: how many roads each city gets, which is fixed when the index is built, and how wide the search walks, which can be tuned per query to exchange speed for recall.
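The walk through the road network can be sketched on a single layer. This is a deliberate simplification, not HNSW itself: there are no highway layers, and the graph is built by brute force rather than incrementally. The names `m` (roads per city) and `ef` (how wide the search walks) are this sketch's own labels for the two dials.

```python
import heapq
import math
import random

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_graph(vectors, m):
    """Link each vector to its m nearest neighbours, in both directions.
    Brute force for clarity; a real index builds this incrementally and in layers."""
    graph = {i: set() for i in range(len(vectors))}
    for i, v in enumerate(vectors):
        nearest = heapq.nsmallest(m + 1, range(len(vectors)),
                                  key=lambda j: dist(v, vectors[j]))
        for j in nearest:
            if j != i:
                graph[i].add(j)
                graph[j].add(i)
    return graph

def graph_search(query, vectors, graph, entry, k, ef):
    """Best-first walk: keep the ef closest nodes seen so far and stop
    once the nearest unexplored node cannot improve on them."""
    visited = {entry}
    d0 = dist(query, vectors[entry])
    frontier = [(d0, entry)]   # min-heap: closest unexplored node first
    best = [(-d0, entry)]      # max-heap via negation: the ef closest so far
    while frontier:
        d, node = heapq.heappop(frontier)
        if len(best) >= ef and d > -best[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(query, vectors[nb])
            if len(best) < ef or d_nb < -best[0][0]:
                heapq.heappush(frontier, (d_nb, nb))
                heapq.heappush(best, (-d_nb, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    ranked = sorted((-neg_d, i) for neg_d, i in best)
    return [i for _, i in ranked[:k]], len(visited)

random.seed(1)
vectors = [[random.random() for _ in range(8)] for _ in range(500)]
graph = build_graph(vectors, m=8)
query = [random.random() for _ in range(8)]

found, checked = graph_search(query, vectors, graph, entry=0, k=5, ef=50)
exact = heapq.nsmallest(5, range(len(vectors)), key=lambda j: dist(query, vectors[j]))
recall = len(set(found) & set(exact)) / 5
print(f"checked {checked} of {len(vectors)} vectors, recall@5 = {recall:.2f}")
```

Raising `ef` makes the walk visit more vectors and usually raises recall; lowering it does the opposite. On a toy set this small the saving is modest, and the layered highways of real HNSW are what keep the hop count low as the archive grows.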

What a real one does beyond the index

Production needs more than nearest-neighbour. Metadata filtering: "nearest contracts, but only this supplier, after 2024", which the database must apply without wrecking index performance. Hybrid search: combining vector similarity with classic keyword scoring, because exact codes like "INV-4471" must still match exactly. Quantization: compressing vectors to a quarter of their memory at almost no recall cost, which decides your hardware bill. And the unglamorous rest: updates and deletes without rebuilding the index, backups, and access control. The index is the soul; this list is the job.
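The "quarter of their memory" figure is easiest to see with scalar quantization, one common form in which each 4-byte float is stored as a single byte. The sketch below assumes that form and assumes coordinates lie in a known range of -1 to 1; the function names are hypothetical, and it measures only the per-coordinate rounding error, not recall.

```python
import array
import random

def quantize(vector, lo, hi):
    """Map each float in [lo, hi] to one unsigned byte (0..255)."""
    scale = 255 / (hi - lo)
    return bytes(min(255, max(0, round((x - lo) * scale))) for x in vector)

def dequantize(codes, lo, hi):
    """Approximate the original floats from the stored bytes."""
    step = (hi - lo) / 255
    return [lo + c * step for c in codes]

random.seed(2)
DIM = 768
vector = [random.uniform(-1, 1) for _ in range(DIM)]  # stand-in for a real embedding

as_float32 = array.array("f", vector)           # 4 bytes per dimension
as_uint8 = quantize(vector, lo=-1.0, hi=1.0)    # 1 byte per dimension

restored = dequantize(as_uint8, lo=-1.0, hi=1.0)
max_error = max(abs(a - b) for a, b in zip(vector, restored))

print(len(as_float32) * as_float32.itemsize, "bytes as float32")
print(len(as_uint8), "bytes quantized")
print(f"largest per-coordinate error: {max_error:.4f}")
```

Each coordinate moves by at most half a quantization step, which is why nearest-neighbour rankings tend to survive the compression. How much recall is actually lost depends on the data and the index, so it is worth measuring on your own vectors.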

The 2026 landscape, measured

Typical query latency, open-source vector stores (2026 benchmarks), in milliseconds per typical query; lower is better:

· Qdrant (purpose-built, Rust): 12 ms
· Milvus (built for billions of vectors): 18 ms
· Chroma (great for prototypes): 30 ms
· pgvector (inside Postgres, zero new infrastructure): 33 ms

Source: 2026 vector database benchmarks (checked June 12, 2026)

Read those numbers carefully, because the slowest entry is usually the right answer. pgvector lives inside Postgres: your documents, their metadata, and their vectors share one database, one transaction, one backup, and SQL filtering, with no second system to operate. Benchmarks published this year show Postgres with the pgvectorscale extension sustaining 471 queries per second at 99% recall on 50 million vectors, an order of magnitude beyond what most businesses will ever need. The practical decision tree in 2026 is one sentence: pgvector if you run Postgres and have under ~50M vectors; Qdrant when vector search is the product itself; Milvus only when you are genuinely at hundreds of millions.

471 QPS: queries per second at 99% recall on 50 million vectors, for Postgres with pgvectorscale on one machine, in published 2026 benchmarks. The boring option got fast.

Back to the contract

The colleague's question that opened this post now has an answer: embed the archive once, store the vectors next to the records, and the search for "late delivery penalty" returns the contract that says "compensation for delayed handover" as the first result, with the supplier filter applied. That is the whole promise: search that finds what you meant. And every piece of it can be self-hosted inside your own network.
