Data Engineer

Spectraforce Technologies
Sep 26, 2025
Title: Data Engineer

Duration: 3 Months

Location: Chicago, IL (Remote)

About the Project


We're launching a focused 3-month initiative to:

  1. Bulk-ingest over 50,000 supplier contracts into SAP Ariba, with metadata extraction powered by OCR.
  2. Design and implement the database architecture and data flows that will feed our Negotiation AI, including contract detail extraction and supplier spend analytics.


This work currently runs separately from the Negotiation AI MVP, but must be future-ready for seamless integration.

Role Overview

As our Data Engineer, you will own the end-to-end data pipelines. This includes designing scalable databases, developing ingestion workflows, collaborating with our internal Machine Learning Engineering team, and structuring supplier spend data. You'll work closely with the Full Stack Developer to co-design the database schema for the Negotiation AI and ensure future compatibility with the ingestion pipeline.

Key Deliverables

  • Ingestion Pipeline: Build and deploy a robust ETL/ELT pipeline using Azure to ingest 50,000+ contracts.
  • Metadata Extraction: Configure and run OCR workflows (e.g., OlmOCR/Azure Document Intelligence) to extract key contract fields such as dates, parties, and terms.
  • Scalable Database Schema: Design and implement a schema in Azure PostgreSQL to store contract metadata, OCR outputs, and supplier spend data. Collaborate with the Full Stack Developer to design a future-ready schema for AI consumption.
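
For candidates who want a feel for the shape of the schema deliverable, here is a minimal sketch of how contract metadata, OCR outputs, and supplier spend data might relate. It is an illustration under stated assumptions, not the project's actual design: every table name, column name, and helper function is hypothetical, the sample values are made up, and an in-memory SQLite database stands in for Azure PostgreSQL so the example runs with the Python standard library alone.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
# In-memory SQLite stands in for Azure PostgreSQL.
SCHEMA = """
CREATE TABLE supplier (
    supplier_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE contract (
    contract_id    INTEGER PRIMARY KEY,
    supplier_id    INTEGER NOT NULL REFERENCES supplier(supplier_id),
    effective_date TEXT,
    expiry_date    TEXT
);
CREATE TABLE ocr_output (
    ocr_id      INTEGER PRIMARY KEY,
    contract_id INTEGER NOT NULL REFERENCES contract(contract_id),
    field_name  TEXT NOT NULL,
    field_value TEXT,
    confidence  REAL
);
CREATE TABLE supplier_spend (
    spend_id    INTEGER PRIMARY KEY,
    supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
    period      TEXT NOT NULL,
    amount      REAL NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def load_contract(conn, supplier_name, ocr_fields):
    """Insert one contract and its raw OCR fields; return the contract_id.

    ocr_fields maps a field name to a (value, confidence) pair.
    """
    # Reuse the supplier row if it already exists.
    conn.execute(
        "INSERT OR IGNORE INTO supplier (name) VALUES (?)", (supplier_name,)
    )
    supplier_id = conn.execute(
        "SELECT supplier_id FROM supplier WHERE name = ?", (supplier_name,)
    ).fetchone()[0]

    # Promote the date fields to typed contract columns when present.
    effective = ocr_fields.get("effective_date", (None, None))[0]
    expiry = ocr_fields.get("expiry_date", (None, None))[0]
    cur = conn.execute(
        "INSERT INTO contract (supplier_id, effective_date, expiry_date) "
        "VALUES (?, ?, ?)",
        (supplier_id, effective, expiry),
    )
    contract_id = cur.lastrowid

    # Keep every raw OCR field with its confidence for later validation.
    conn.executemany(
        "INSERT INTO ocr_output (contract_id, field_name, field_value, confidence) "
        "VALUES (?, ?, ?, ?)",
        [(contract_id, name, value, conf) for name, (value, conf) in ocr_fields.items()],
    )
    conn.commit()
    return contract_id


if __name__ == "__main__":
    db = build_demo_db()
    cid = load_contract(
        db,
        "Acme Corp",  # made-up supplier
        {"effective_date": ("2024-01-01", 0.98), "expiry_date": ("2025-01-01", 0.91)},
    )
    print(db.execute("SELECT * FROM contract WHERE contract_id = ?", (cid,)).fetchone())
```

Keeping raw OCR fields in their own table, separate from the typed contract columns, is one possible way to let extracted values be validated or re-extracted without rewriting contract records.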


Required Skills & Experience

Data Engineering & ETL/ELT

  • Experience with Azure PostgreSQL or similar relational databases
  • Skilled in building scalable ETL/ELT pipelines (preferably using Azure)
  • Proficient in Python for scripting and automation


OCR Collaboration

  • Ability to work with internal Machine Learning Engineering teams to validate and structure extracted data
  • Familiarity with OCR tools (e.g., Azure Document Intelligence, Tesseract) is a plus


SAP Ariba Integration

  • Exposure to cXML, ARBCI, and SOAP/REST protocols is a plus
  • Comfortable with API authentication (OAuth, tokens) and enterprise-grade security


Agile Collaboration & Documentation

  • Comfortable working in sprints and cross-functional teams
  • Able to use GitHub Copilot to document practices for handover


Preferred Qualifications

  • Experience with large-scale contract ingestion projects
  • Familiarity with procurement systems and contract lifecycle management
  • Background in integrating data pipelines with AI or analytics platforms
