From PDF to DEXPI - Making Engineering Data Machine-Readable
By Anand George
Introduction: PDFs Aren’t the Final Form
P&IDs (piping and instrumentation diagrams) are often handed over as PDFs, a format that's perfect for printing but terrible for automation. Information that looks clear to the human eye becomes inaccessible to software: in a scanned PDF, symbols are flattened into pixels and text is no longer searchable, and even in a vector PDF the structural relationships between items are lost.
To unlock true automation, data needs to move from visual form to a machine-readable format — and that’s where DEXPI comes in.
In this post, we’ll explain why converting PDFs to DEXPI matters, how it works, and what benefits it unlocks across your engineering workflows.
Why PDF-Based Workflows Are a Bottleneck
PDFs:
- Are unstructured — no semantic meaning
- Require OCR and symbol detection for any extraction
- Make it hard to validate, revise, or link to downstream systems
This limits automation in:
- Material takeoff (MTO)
- Control system integration
- Safety and compliance tracking
- Digital twin handover
Each of these steps stays manual until the drawing's content is available as structured data.
What Is DEXPI?
DEXPI (Data Exchange in the Process Industry) is an industry initiative that publishes a standardized information model for P&IDs, aligned with ISO 15926 and typically exchanged as XML. It allows engineering data, including equipment, instrumentation, piping, and control logic, to be stored in a structured, machine-readable way.
Think of it as a smart container for everything your P&ID represents:
- Symbols with meaning (e.g., this is a centrifugal pump)
- Relationships (e.g., this line flows into this vessel)
- Attributes (e.g., size, material, tag number)
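To make the three ideas above concrete, here is a minimal sketch of how a symbol with meaning, a relationship, and a set of attributes might be held in memory. The class names, field names, and tag values are illustrative assumptions for this post, not names taken from the DEXPI specification.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One P&ID item: a symbol with meaning plus its attributes."""
    tag: str                 # hypothetical tag number, e.g. "P-101"
    component_class: str     # what the symbol means, e.g. "CentrifugalPump"
    attributes: dict = field(default_factory=dict)


@dataclass
class Connection:
    """A directed relationship: the source flows into the target."""
    source_tag: str
    target_tag: str
    line_number: str = ""


# Illustrative data only: a pump feeding a vessel through one line.
pump = Component("P-101", "CentrifugalPump", {"material": "CS"})
vessel = Component("V-201", "Vessel", {"material": "SS316"})
line = Connection(pump.tag, vessel.tag, line_number="L-1001")
```

The point of the sketch is that each fact the drawing shows visually becomes an explicit field that software can query.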
DEXPI files can be used by:
- SmartPlant P&ID
- COMOS
- Custom internal tools
- Any system that can read DEXPI XML or work with ISO 15926-based data
Our Approach: PDF → Graph → DEXPI
At Storm Consulting, we treat each P&ID as a graph and build it in five steps:
- Detect symbols with OpenCV-based detection or template matching
- Extract text via OCR
- Trace lines and flow-direction chevrons to build flow paths
- Construct a graph whose nodes are equipment and instruments and whose edges are lines and flows_to relationships
- Export the graph to a DEXPI-compliant structure
This ensures:
- Each component is identified by class (e.g., Valve, FT for a flow transmitter, Pump)
- Properties (e.g., size, service) are retained
- Relationships are stored for downstream processing
The result? A structured, interoperable file ready for automation.
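The last two steps of the pipeline can be sketched roughly as follows, assuming symbol detection, OCR, and line tracing have already produced a list of tagged symbols and a list of traced connections. The function names, the input layout, and the XML element names (PidGraph, Component, Connection) are hypothetical and deliberately simplified. This is not the DEXPI schema, and a real export would have to follow the published specification.

```python
import xml.etree.ElementTree as ET


def build_graph(symbols, traced_lines):
    """Assemble nodes and directed edges from detection results.

    symbols: list of dicts with "tag", "class", and optional "properties".
    traced_lines: list of (source_tag, target_tag) pairs from line tracing.
    """
    nodes = {s["tag"]: s for s in symbols}
    edges, unresolved = [], []
    for source, target in traced_lines:
        # Export only edges whose endpoints were both detected;
        # everything else is set aside for manual review.
        if source in nodes and target in nodes:
            edges.append((source, target))
        else:
            unresolved.append((source, target))
    return nodes, edges, unresolved


def export_xml(nodes, edges):
    """Serialize the graph to a simplified XML layout (NOT the DEXPI schema)."""
    root = ET.Element("PidGraph")
    for tag, node in nodes.items():
        item = ET.SubElement(root, "Component", Tag=tag, Class=node["class"])
        for name, value in node.get("properties", {}).items():
            ET.SubElement(item, "Attribute", Name=name, Value=str(value))
    for source, target in edges:
        ET.SubElement(root, "Connection", From=source, To=target, Type="flows_to")
    return ET.tostring(root, encoding="unicode")


# Illustrative input; "X-999" was never detected as a symbol.
symbols = [
    {"tag": "P-101", "class": "Pump", "properties": {"service": "Feed water"}},
    {"tag": "FT-102", "class": "FT"},
    {"tag": "V-201", "class": "Vessel"},
]
traced_lines = [("P-101", "FT-102"), ("FT-102", "V-201"), ("V-201", "X-999")]

nodes, edges, unresolved = build_graph(symbols, traced_lines)
xml_text = export_xml(nodes, edges)
```

Keeping unresolved connections in a separate list, rather than dropping them, gives a reviewer something concrete to check before the file is handed downstream.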
Why This Matters
Moving from PDF to DEXPI enables:
- Automated MTO extraction
- Smart change tracking between revisions
- Integration with 3D modeling or control logic
- Digital twin readiness
It’s not just about making data accessible — it’s about making it useful.
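As a rough illustration of the first two items, here is a sketch of a takeoff count and a revision comparison over structured component records. It assumes each component is a plain dict with "tag", "class", and an optional "size"; those keys and the sample data are assumptions made for this example only.

```python
from collections import Counter


def material_takeoff(components):
    """Count components by (class, size) as a basic takeoff list."""
    return Counter(
        (c["class"], c.get("size", "unspecified")) for c in components
    )


def diff_revisions(old, new):
    """Compare two revisions keyed by tag: added, removed, and changed tags."""
    old_by_tag = {c["tag"]: c for c in old}
    new_by_tag = {c["tag"]: c for c in new}
    added = sorted(new_by_tag.keys() - old_by_tag.keys())
    removed = sorted(old_by_tag.keys() - new_by_tag.keys())
    changed = sorted(
        tag for tag in old_by_tag.keys() & new_by_tag.keys()
        if old_by_tag[tag] != new_by_tag[tag]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Illustrative revisions of the same drawing.
rev_a = [
    {"tag": "V-101", "class": "GateValve", "size": "DN50"},
    {"tag": "V-102", "class": "GateValve", "size": "DN50"},
    {"tag": "P-101", "class": "Pump"},
]
rev_b = [
    {"tag": "V-101", "class": "GateValve", "size": "DN80"},
    {"tag": "P-101", "class": "Pump"},
    {"tag": "FT-103", "class": "FT"},
]

mto = material_takeoff(rev_a)
changes = diff_revisions(rev_a, rev_b)
```

Neither function is possible against a PDF without first extracting the data, which is the practical difference between a drawing and a dataset.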
Challenges We Address
- Variability in symbol sets (vendor-specific P&IDs)
- Poor scan quality and OCR accuracy
- Incomplete or inconsistent tag formats
- Mapping complex drawing logic to a structured data model
Our tools are designed to be robust, hybrid, and reviewable — so that what goes into DEXPI is accurate, complete, and validated.
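One way to handle inconsistent tag formats is to normalize what can be parsed and flag the rest for review. The sketch below assumes a simple tag convention (a short letter prefix, a number, and an optional one-letter suffix). Real projects use their own tagging rules, so the pattern here is an assumption, not a standard.

```python
import re

# Assumed convention for this sketch: 1-4 letters, an optional separator,
# 2-5 digits, and an optional single-letter suffix (e.g. "FT-102", "P 101A").
TAG_PATTERN = re.compile(r"^\s*([A-Za-z]{1,4})[\s\-_]*(\d{2,5})([A-Za-z]?)\s*$")


def normalize_tag(raw):
    """Return the tag as 'PREFIX-NUMBER[SUFFIX]', or None if it needs review."""
    match = TAG_PATTERN.match(raw)
    if match is None:
        return None
    prefix, number, suffix = match.groups()
    return f"{prefix.upper()}-{number}{suffix.upper()}"


# Illustrative OCR output, including one misread ("O" instead of "0").
raw_tags = ["ft 102", "P101A", " pt_2001 ", "FT-1O2"]
normalized = [normalize_tag(t) for t in raw_tags]
needs_review = [t for t, n in zip(raw_tags, normalized) if n is None]
```

Returning None instead of guessing keeps doubtful tags out of the export until a person has looked at them.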
Conclusion: Turn Engineering Drawings into Engineering Data
If your P&IDs live in PDFs, your data is locked in a format that’s hard to query, integrate, or automate. By converting to DEXPI, you give your engineering documents a second life — one where they’re interoperable, machine-readable, and future-ready.
At Storm Consulting, we specialize in helping teams make this transition — one drawing at a time.