edwhittle/salesforce-spring-poc

Salesforce POC - Spring Boot Application with Lucene Search

Overview

This Spring Boot application provides fast search capabilities for a large product dataset (400K+ products) using both traditional database queries and Apache Lucene full-text search indexing.

Features

  • Product Management: Complete CRUD operations for products
  • CSV Import: Bulk import of products from CSV files
  • Lucene Search: High-performance full-text search with sub-second response times

Performance Benefits

For a dataset of 400K products:

  • Database Search: 2-30 seconds depending on complexity
  • Lucene Search: 5-50 milliseconds (roughly 40-6,000x faster, comparing the extremes of the two ranges)

API Endpoints

Product Management

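  • GET /api/productBySupplier/{supplierIds}?brandSearch={brand}&itemDescriptionSearch={description}&limit={limit} - Search by one or more supplier IDs (comma-separated), with optional fuzzy filters (brandSearch, itemDescriptionSearch) and an optional result limit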

Index Management

  • POST /api/search/index/rebuild - Rebuild the Lucene index
  • GET /api/search/index/stats - Get Lucene index statistics

Searchable Fields

Primary Index (Optimized for Performance):

  • supplier: Supplier ID (primary search field - fastest performance)

Secondary Index:

  • productId: Product identifier
  • itemDescription: Product description (searchable but not optimized)

Stored Fields (retrievable but not indexed for search):

  • supplierGroupId: Supplier group identifier
  • smktsMerchCategory: Merchandise category
  • liqMerchCategory: Liquor merchandise category
  • digitalBrandName: Digital brand name
  • subBrandName: Sub-brand name
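
The three tiers above can be kept straight in code with a small enum. This is an illustrative sketch only: the class, enum, and method names are hypothetical and are not taken from the project's source. In Lucene terms, stored-only values are typically added as StoredField instances, while searchable ones use an indexed field type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model of the field tiers listed in this README.
class ProductFieldSketch {
    enum Role { PRIMARY_INDEX, SECONDARY_INDEX, STORED_ONLY }

    enum Field {
        SUPPLIER("supplier", Role.PRIMARY_INDEX),
        PRODUCT_ID("productId", Role.SECONDARY_INDEX),
        ITEM_DESCRIPTION("itemDescription", Role.SECONDARY_INDEX),
        SUPPLIER_GROUP_ID("supplierGroupId", Role.STORED_ONLY),
        SMKTS_MERCH_CATEGORY("smktsMerchCategory", Role.STORED_ONLY),
        LIQ_MERCH_CATEGORY("liqMerchCategory", Role.STORED_ONLY),
        DIGITAL_BRAND_NAME("digitalBrandName", Role.STORED_ONLY),
        SUB_BRAND_NAME("subBrandName", Role.STORED_ONLY);

        final String fieldName;
        final Role role;

        Field(String fieldName, Role role) {
            this.fieldName = fieldName;
            this.role = role;
        }

        // A field can be queried only if it belongs to one of the index tiers.
        boolean searchable() {
            return role != Role.STORED_ONLY;
        }
    }

    /** Returns the names of all fields that can appear in a search query, in declaration order. */
    public static List<String> searchableFieldNames() {
        return Arrays.stream(Field.values())
                .filter(Field::searchable)
                .map(f -> f.fieldName)
                .collect(Collectors.toList());
    }
}
```

Calling ProductFieldSketch.searchableFieldNames() gives a quick way to validate that a query only targets indexed fields.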

Quick Start

1. Build and Run

mvn clean install
mvn spring-boot:run

2. On startup, the application automatically:

  • Import CSV data from src/main/resources/data-all.csv
  • Create H2 database with optimized settings for 400K products
  • Build Lucene search index
  • Start web server on port 8080
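
The CSV import step has to split each row into fields before it can build product records. The sketch below shows one way to split a single line while honouring double-quoted fields. It is not the project's importer: the class and method names are hypothetical, the column layout of data-all.csv is not documented here, and a production import would more likely rely on a CSV library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the project's source.
class CsvLineSketch {
    /**
     * Splits one CSV line on commas. Commas inside double quotes are kept,
     * and a doubled quote ("") inside a quoted field becomes a literal quote.
     */
    public static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }
}
```

The sketch handles one line at a time, so it does not cover quoted fields that span multiple lines.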

Docker

The container image for this application is based on azul/zulu-openjdk:21.

  • To build the container, run docker build -t springio/salesforce-poc-springboot .
  • To start the container, run docker run -p 8080:8080 -t springio/salesforce-poc-springboot

Configuration

Application Properties

The application is optimized for large datasets with:

  • Persistent H2 Database: Data survives application restarts
  • Connection Pooling: 20 max connections, 5 minimum idle
  • Batch Processing: 50 records per batch for optimal performance
  • Second-level Caching: Enabled for frequently accessed data
  • JPA Optimizations: Batch inserts, query optimization
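
To illustrate the batch size above, here is a minimal sketch of splitting a list of records into chunks of 50 before persisting them. The class and method names are hypothetical. In a Hibernate-backed application the JDBC batch size is commonly set through the hibernate.jdbc.batch_size property; whether this project configures it that way is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the project's source.
class BatchSketch {
    // Batch size quoted in this README.
    static final int BATCH_SIZE = 50;

    /** Splits items into consecutive batches of at most batchSize elements. */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch returned by partition(products, BatchSketch.BATCH_SIZE) could then be saved in one repository call, keeping memory use flat during a large import.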

Memory Configuration

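For 400K products, recommended JVM settings (the 6GB maximum heap assumes a host with 8GB+ RAM, as in the production recommendations under Resource Requirements):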

java -Xms2g -Xmx6g -jar salesforce-poc-0.0.1-SNAPSHOT.jar
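
To confirm that the heap flags actually took effect, the running JVM can report its maximum heap through Runtime.getRuntime().maxMemory(). The sketch below is a hypothetical helper, not part of the project; note that some garbage collectors report slightly less than the -Xmx value, so the threshold is best set a little below the target.

```java
// Hypothetical helper; not part of the project's source.
class HeapCheckSketch {
    static final long BYTES_PER_GIB = 1024L * 1024L * 1024L;

    /** Returns true when the given max heap (in bytes) is at least requiredGiB. */
    public static boolean heapAtLeast(long maxHeapBytes, double requiredGiB) {
        return maxHeapBytes >= (long) (requiredGiB * BYTES_PER_GIB);
    }

    /** Describes the max heap of the JVM this code is running in. */
    public static String describeHeap() {
        long max = Runtime.getRuntime().maxMemory();
        return String.format("max heap: %.1f GiB", (double) max / BYTES_PER_GIB);
    }
}
```

A startup hook could log describeHeap() and warn when heapAtLeast(Runtime.getRuntime().maxMemory(), 5.5) is false.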

Search Examples

Single Supplier Search

curl "http://localhost:8080/api/productBySupplier/12345"

Multiple Supplier Search with Filters

# Advanced search: suppliers + brand + item description (fuzzy search)
curl "http://localhost:8080/api/productBySupplier/959609,980801?brandSearch=Coles&itemDescriptionSearch=FREE&limit=10"

# Search by supplier and brand only
curl "http://localhost:8080/api/productBySupplier/959609?brandSearch=Taste&limit=10"

# Search by supplier and description only (fuzzy search)
curl "http://localhost:8080/api/productBySupplier/959609?itemDescriptionSearch=CRACKER&limit=10"

# Just supplier search (no filters)
curl "http://localhost:8080/api/productBySupplier/959609?limit=5"
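
The same requests can be issued from Java with the JDK's built-in HTTP client. This is a minimal sketch assuming the default local setup from this README; the class and method names are hypothetical, and the response is returned as a raw string because the response schema is not documented here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collectors;

// Hypothetical client; not part of the project's source.
class ProductSearchClient {
    // Assumes the application is running locally on its default port.
    static final String BASE_URL = "http://localhost:8080";

    /** Builds the search URL; filters passed as null are left out of the query string. */
    public static String buildUrl(List<Long> supplierIds, String brand, String description, Integer limit) {
        // Multiple supplier IDs are joined with commas in the path.
        String ids = supplierIds.stream().map(String::valueOf).collect(Collectors.joining(","));
        StringJoiner query = new StringJoiner("&");
        if (brand != null) query.add("brandSearch=" + encode(brand));
        if (description != null) query.add("itemDescriptionSearch=" + encode(description));
        if (limit != null) query.add("limit=" + limit);
        String url = BASE_URL + "/api/productBySupplier/" + ids;
        return query.length() == 0 ? url : url + "?" + query;
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Sends a GET request and returns the response body as text. */
    public static String fetch(String url) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

For example, ProductSearchClient.fetch(ProductSearchClient.buildUrl(List.of(959609L, 980801L), "Coles", "FREE", 10)) mirrors the first curl call in this section.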

Database Access

H2 Console available at: http://localhost:8080/h2-console

  • JDBC URL: jdbc:h2:file:./data/productdb
  • Username: sa
  • Password: (empty)

Troubleshooting

Memory Issues

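For large datasets, increase the JVM heap size. In Spring Boot 2.2 and later, spring-boot:run forks a separate JVM for the application, so MAVEN_OPTS only affects the Maven process itself. Pass the heap flags through the plugin instead:

mvn spring-boot:run -Dspring-boot.run.jvmArguments="-Xms2g -Xmx6g"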

Resource Requirements

Minimum System Requirements

  • RAM: 4GB (2GB for application, 2GB for OS)
  • Storage: 5GB (3GB for database, 1GB for index, 1GB for application)
  • CPU: 2 cores minimum, 4+ recommended for optimal performance

Production Recommendations

  • RAM: 8GB+ (6GB for application heap)
  • Storage: SSD recommended for database and index files
  • CPU: 4+ cores for concurrent search operations
