Updated 2026-05-23

PostgreSQL Vector Search with pgvector

No new cluster required. Every EasyCloudify PostgreSQL cluster on version 13 or later ships with pgvector pre-installed, and clusters on version 14 or later also include pgvectorscale. Enable each extension with a single SQL command on any existing database.

What Are pgvector and pgvectorscale?

| Extension | Available on | What it adds |
| --- | --- | --- |
| vector (pgvector) | PostgreSQL 13+ | vector, halfvec, sparsevec column types · exact and approximate nearest-neighbour search · HNSW and IVFFlat indexes |
| vectorscale (pgvectorscale) | PostgreSQL 14+ | StreamingDiskANN index · Statistical Binary Quantization (SBQ) for large, disk-resident workloads |

Both extensions run inside your existing cluster, so vector data uses the same connection string, backups, and read replicas as the rest of your data. There is no separate vector database infrastructure to manage.

Enable the Extensions

Connect to any database inside your cluster with psql (or any SQL client):

-- Required: pgvector (all PG 13+ clusters)
CREATE EXTENSION IF NOT EXISTS vector;

-- Optional: pgvectorscale for large workloads (PG 14+ clusters)
-- CASCADE also enables pgvector if it is not already enabled
CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;

Verify installation:

-- \dx is a psql meta-command; in other SQL clients, query pg_extension instead
\dx
-- Should list: vector, vectorscale
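
For deployment scripts, the verification step can also be automated. The following is a minimal sketch, assuming a DB-API cursor such as the one psycopg2 provides. The helper name `missing_extensions` is hypothetical; it reads the standard `pg_extension` catalog, which lists the extensions enabled in the current database.

```python
from typing import Iterable, List

REQUIRED = ("vector", "vectorscale")


def missing_extensions(cur, required: Iterable[str] = REQUIRED) -> List[str]:
    """Return the required extensions that are not enabled in the current database."""
    wanted = list(required)
    # pg_extension holds one row per extension enabled in this database
    cur.execute(
        "SELECT extname FROM pg_extension WHERE extname = ANY(%s)",
        (wanted,),
    )
    installed = {row[0] for row in cur.fetchall()}
    return [name for name in wanted if name not in installed]
```

An empty list means both extensions are enabled. On a PostgreSQL 13 cluster, pass `required=("vector",)` because pgvectorscale is only available from version 14.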

Quick Start — Storing and Querying Embeddings

1. Create a table with a vector column

-- 1536 dimensions = OpenAI text-embedding-3-small / text-embedding-ada-002
-- Adjust to match your embedding model's output dimension
CREATE TABLE documents (
  id        BIGSERIAL PRIMARY KEY,
  content   TEXT        NOT NULL,
  embedding VECTOR(1536)
);
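
Because the column is declared with a fixed dimension, an embedding of any other length is rejected at insert time. A small client-side check catches this earlier. The sketch below is illustrative only: `to_pgvector` and `EMBEDDING_DIM` are hypothetical names, and the function produces pgvector's text representation (`[v1,v2,...]`).

```python
import math
from typing import Sequence

EMBEDDING_DIM = 1536  # assumption: must match the VECTOR(n) column definition


def to_pgvector(values: Sequence[float], dim: int = EMBEDDING_DIM) -> str:
    """Serialise an embedding to pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(values)}")
    floats = [float(v) for v in values]
    # pgvector rejects NaN and infinite elements, so fail early on the client
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("embedding contains NaN or infinity")
    return "[" + ",".join(repr(v) for v in floats) + "]"
```

The returned string can be sent as a bound parameter and cast with `::vector` in the SQL statement.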

2. Insert embeddings

pgvector stores and searches vectors but does not generate them. Compute embeddings in your application and pass them to PostgreSQL as query parameters:

import openai
import psycopg2

# The openai client reads its API key from the OPENAI_API_KEY environment variable
conn = psycopg2.connect("postgresql://...")
cur = conn.cursor()

text = "EasyCloudify managed PostgreSQL with pgvector"
response = openai.embeddings.create(
    model="text-embedding-3-small",
    input=text,
)
embedding = response.data[0].embedding  # list of 1536 floats

# psycopg2 sends the list as a SQL array; the explicit cast converts it to vector
cur.execute(
    "INSERT INTO documents (content, embedding) VALUES (%s, %s::vector)",
    (text, embedding),
)
conn.commit()
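
When loading many documents, inserting in batches keeps transactions short. The following is a rough sketch rather than a tuned loader: `batched`, `insert_documents`, and the default batch size of 500 are assumptions for illustration, and `conn` is any DB-API connection (for example from psycopg2). Embeddings are serialised to pgvector's text form so the code does not depend on driver-specific array handling.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[str, Sequence[float]]  # (content, embedding)

INSERT_SQL = "INSERT INTO documents (content, embedding) VALUES (%s, %s::vector)"


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def insert_documents(conn, rows: Iterable[Row], batch_size: int = 500) -> int:
    """Insert rows batch by batch, committing after each batch. Returns the row count."""
    total = 0
    with conn.cursor() as cur:
        for batch in batched(rows, batch_size):
            params = [
                # pgvector's text form is '[v1,v2,...]'
                (content, "[" + ",".join(str(float(v)) for v in embedding) + "]")
                for content, embedding in batch
            ]
            cur.executemany(INSERT_SQL, params)
            conn.commit()
            total += len(batch)
    return total
```

For very large initial loads, a driver's COPY support is usually faster than row-by-row inserts; this sketch favours simplicity over throughput.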

3. Build an HNSW index

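Create the index after bulk-loading data: building it once over existing rows is faster than updating it on every insert. Choose the operator class that matches the distance operator your queries use:

-- Cosine distance, queried with <=> (a common choice for OpenAI / Cohere text embeddings)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Euclidean distance (L2), queried with <->
-- CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);

-- Inner product (dot product), queried with <#>, which returns the negative inner product
-- CREATE INDEX ON documents USING hnsw (embedding vector_ip_ops);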
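
The three operator classes above correspond to pgvector's distance operators `<=>` (cosine distance), `<->` (Euclidean distance), and `<#>` (negative inner product). As a reference for what each operator computes, here is a plain-Python sketch; the function names are illustrative and are not part of pgvector.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, the value returned by <->."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Negated dot product, the value returned by <#> (smaller means more similar)."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 minus cosine similarity, the value returned by <=>."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

All three are distances in the sense that smaller values mean closer vectors, which is why queries order by the operator in ascending order.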

4. Nearest-neighbour search

-- Top-5 most similar documents to a query embedding
SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 5;

Pass the query embedding as a bound parameter from your application — never concatenate raw vectors into SQL strings.
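
HNSW is an approximate index, so an indexed query can occasionally miss a row that an exact scan would return. One way to check result quality is to run the same query once with the index and once as an exact scan, then compare the returned ids. The sketch below only shows the comparison step; `recall_at_k` is a hypothetical helper, and how you obtain the exact results (for example by temporarily disabling index scans in a test session) is left as an assumption.

```python
from typing import Hashable, Sequence


def recall_at_k(exact_ids: Sequence[Hashable], approx_ids: Sequence[Hashable]) -> float:
    """Fraction of the exact top-k results that the approximate search also returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)
```

If recall is lower than you need, pgvector's `hnsw.ef_search` setting trades query speed for higher recall.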

Hybrid Search (Full-text + Vector)

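Combine keyword precision with semantic recall. The query below keeps only rows that match the full-text query, then ranks them by vector similarity; keyword_score is returned so your application can use it for re-ranking: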

SELECT
  d.id,
  d.content,
  ts_rank(to_tsvector('english', d.content), query) AS keyword_score,
  1 - (d.embedding <=> $1::vector)                  AS semantic_score
FROM documents d,
     plainto_tsquery('english', $2) AS query
WHERE to_tsvector('english', d.content) @@ query
ORDER BY d.embedding <=> $1::vector  -- same ranking as semantic_score DESC
LIMIT 10;
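
The query above filters by keyword and ranks by vector similarity. Another common way to combine the two signals is reciprocal rank fusion (RRF), which merges two independently ranked id lists. RRF is not part of the query above; it is shown here only as an illustrative alternative. The function name and the constant `k = 60` (a value often used in the literature) are assumptions.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Merge ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order (sorted is stable)
    return sorted(scores, key=lambda doc_id: -scores[doc_id])
```

In practice you would run one full-text query and one vector query, each with its own `LIMIT`, and pass the two id lists to this function.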

Large Workloads with pgvectorscale

For datasets in the tens of millions of vectors or more, or any dataset too large to fit in RAM, use the diskann index from pgvectorscale instead of HNSW:

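-- StreamingDiskANN: low memory footprint, high recall on disk-resident data
CREATE INDEX ON documents USING diskann (embedding);

-- Statistical Binary Quantization (SBQ) is controlled by the storage_layout option.
-- In upstream pgvectorscale, 'memory_optimized' enables SBQ and is the default;
-- 'plain' stores vectors uncompressed. To set it explicitly, create the index as:
-- CREATE INDEX ON documents USING diskann (embedding)
--   WITH (storage_layout = 'memory_optimized');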

Query syntax is identical to pgvector — the index type is transparent to your application.

Choosing the Right Index

| Scenario | Recommended index |
| --- | --- |
| Dataset fits in RAM, < 5 M rows | HNSW (vector) |
| Dataset exceeds RAM, 5 M – 100 M+ rows | StreamingDiskANN (diskann) |
| Maximum storage compression needed | diskann + SBQ |
| Exact brute-force search (small tables) | No index (sequential scan) |
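
To judge whether a dataset "fits in RAM", a back-of-the-envelope estimate of raw vector storage is often enough. pgvector documents the `vector` type as using 4 bytes per dimension plus 8 bytes per value. The sketch below applies that formula; the function names and the 50% headroom factor are assumptions, and the estimate excludes index size and other table columns.

```python
def vector_storage_bytes(rows: int, dims: int, bytes_per_dim: int = 4) -> int:
    """Approximate storage for `rows` vectors: bytes_per_dim * dims + 8 bytes each."""
    return rows * (bytes_per_dim * dims + 8)


def fits_in_ram(rows: int, dims: int, ram_bytes: int, headroom: float = 0.5) -> bool:
    """True if the raw vectors fit within the given fraction of RAM.

    headroom is a hypothetical safety factor that leaves room for the index,
    other data, and the operating system.
    """
    return vector_storage_bytes(rows, dims) <= ram_bytes * headroom
```

For example, one million 1536-dimension vectors need roughly 6.2 GB before any index is built, so a larger table of the same embeddings quickly exceeds the memory of a small cluster.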

Upgrade Extension Versions

Extensions are kept at their installed version until you explicitly upgrade:

ALTER EXTENSION vector UPDATE;
ALTER EXTENSION vectorscale UPDATE;

Check current versions:

SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name IN ('vector', 'vectorscale');
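
The result of the version query can be checked programmatically, for instance in a maintenance script. This is a small sketch assuming the rows come back as `(name, default_version, installed_version)` tuples in the column order of the query above; `pending_upgrades` is a hypothetical helper.

```python
from typing import Iterable, List, Optional, Tuple

ExtensionRow = Tuple[str, str, Optional[str]]  # (name, default_version, installed_version)


def pending_upgrades(rows: Iterable[ExtensionRow]) -> List[str]:
    """Return names of enabled extensions whose installed version differs from the default."""
    return [
        name
        for name, default_version, installed_version in rows
        # installed_version is NULL (None) when the extension is not enabled
        if installed_version is not None and installed_version != default_version
    ]
```

Each returned name is a candidate for `ALTER EXTENSION ... UPDATE`.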

When to Use PostgreSQL vs OpenSearch for Vectors

| | PostgreSQL + pgvector | OpenSearch |
| --- | --- | --- |
| Best for | Vectors alongside relational data, existing PG schemas | Hybrid keyword + vector search as a first-class feature, log analytics |
| Dataset size | Up to tens of millions of rows | Hundreds of millions+ |
| Tooling | Standard SQL, migrations, ORMs | OpenSearch Dashboards, k-NN plugin |
| Cluster type | Starter or HA | Starter or HA |
| Extra cost | None (same cluster) | Separate OpenSearch subscription |

If you already run PostgreSQL for your application data, adding pgvector is almost always the right first choice. Migrate to a dedicated vector store only if you outgrow it, for example when the dataset grows well beyond tens of millions of rows.

Connection Example (Node.js / pg)

import { Pool } from 'pg'

const pool = new Pool({ connectionString: process.env.DATABASE_URL })

// Returns the `limit` documents closest to queryEmbedding by cosine distance
async function similarDocuments(queryEmbedding: number[], limit = 5) {
  const { rows } = await pool.query(
    `SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
     FROM documents
     ORDER BY embedding <=> $1::vector
     LIMIT $2`,
    // pgvector's text format is '[v1,v2,...]'; it is sent as a bound parameter
    [`[${queryEmbedding.join(',')}]`, limit]
  )
  return rows
}

Next Steps

  • Connect to Your Database — get your connection URI
  • Database Users & Databases — create a dedicated role for your vector workload
  • Connection Pooling — PgBouncer for high-concurrency embedding ingestion
  • Database Monitoring — watch index build progress and query performance