Converting a KMZ pipeline file into a thermodynamic flowsheet requires extracting the geographic route geometry, projecting it into a coordinate system that gives you accurate distances and elevation changes, segmenting the route at hydraulically significant breaks (diameter changes, pig launcher/receiver locations, valve stations, elevation inflection points), and then assigning pipe specifications and fluid properties to each segment. The result is a multi-segment pipeline model you can solve for pressure, temperature, and flow rate at every node.
The short answer
The workflow in five steps:
- Extract KML from KMZ: a KMZ file is a ZIP archive. Unzip it to get one or more .kml files. Parse the KML XML to extract the pipeline centerline as a sequence of (lon, lat, elevation) coordinate triples.
- Project to a local coordinate system: use pyproj to convert geographic coordinates (WGS84) to a projected CRS (e.g., UTM or a state plane) that gives you distances in meters rather than degrees. Compute the cumulative distance along the pipeline centerline.
- Segment the route: divide the centerline into pipe segments at: (a) diameter or wall thickness changes, (b) pig traps and valve stations, (c) compressor or pump stations, (d) significant elevation inflection points (user-defined threshold, e.g., ±30 ft change in a span).
- Assign pipe and fluid properties: for each segment, assign internal diameter, roughness, fluid composition, and thermal properties (ambient temperature, insulation). Use the CalebBell fluids library for friction factor calculations (Colebrook-White or explicit approximations).
- Solve the hydraulic model: compute pressure and temperature at each node sequentially or iteratively, using the Darcy-Weisbach equation for friction drop plus hydrostatic correction for elevation. For gas pipelines, account for compressibility (Z-factor) using the appropriate EOS.
What KMZ and KML actually contain
KMZ is a Google Earth package—a ZIP archive containing at minimum a doc.kml file and optionally images, overlays, or sub-files. KML is an XML dialect for geographic features. A pipeline centerline is typically stored as a Placemark containing a LineString with a coordinates element:
```xml
<Placemark>
  <name>Pipeline Route - 16-inch Mainline</name>
  <LineString>
    <altitudeMode>absolute</altitudeMode>
    <coordinates>
      -97.4523,31.2847,285.4 -97.4498,31.2851,286.1 -97.4472,31.2856,287.3
      <!-- ... hundreds or thousands of vertices ... -->
    </coordinates>
  </LineString>
</Placemark>
```

Each coordinate triple is longitude, latitude, altitude. Altitude is always in meters; altitudeMode sets the reference: absolute means height above sea level, relativeToGround means height above the terrain, and clampToGround (the default) ignores the altitude value. The number of vertices depends on how the survey was captured: GPS traces at 1-second intervals on a walking survey can produce thousands of points per mile, while aerial or satellite-derived routes may have fewer but more accurate vertices.
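Parsing in Python:

```python
import zipfile
import xml.etree.ElementTree as ET

KML_NS = {'kml': 'http://www.opengis.net/kml/2.2'}

def extract_kml_coordinates(kmz_path: str) -> list[tuple[float, float, float]]:
    """Return (lon, lat, elev) tuples for the first LineString in a KMZ file."""
    with zipfile.ZipFile(kmz_path, 'r') as z:
        # Use the first .kml entry in the archive (conventionally doc.kml)
        kml_name = next(n for n in z.namelist() if n.lower().endswith('.kml'))
        with z.open(kml_name) as f:
            tree = ET.parse(f)
    # Target LineString coordinates so a Point placemark is not picked up first
    node = tree.find('.//kml:LineString/kml:coordinates', KML_NS)
    if node is None or not node.text:
        raise ValueError(f"No LineString coordinates found in {kml_name}")
    points = []
    for triple in node.text.split():
        parts = [float(v) for v in triple.split(',')]
        lon, lat = parts[0], parts[1]
        elev = parts[2] if len(parts) > 2 else 0.0  # altitude is optional in KML
        points.append((lon, lat, elev))
    return points
```

This gives you the raw geographic trace. The function reads only the first LineString and assumes the OGC KML 2.2 namespace, so files with several placemarks or an older namespace need extra handling. The next step is to give the trace physical meaning.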
Projecting coordinates and computing distances
Geographic coordinates in WGS84 (the default for KML) express positions as degrees of longitude and latitude. Distances computed directly from degree differences are inaccurate because a degree of longitude varies from ~111 km at the equator to zero at the poles. For pipeline engineering, you need distances in meters.
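To make the distortion concrete, here is a minimal sketch that estimates the ground length of one degree of longitude at a given latitude. It assumes a spherical Earth with a mean radius of 6,371 km, which is an approximation chosen for illustration; the function name is hypothetical.

```python
import math

def meters_per_degree_lon(lat_deg: float, radius_m: float = 6_371_000.0) -> float:
    """Approximate ground length of one degree of longitude (spherical Earth)."""
    return math.radians(1.0) * radius_m * math.cos(math.radians(lat_deg))

equator = meters_per_degree_lon(0.0)     # roughly 111 km
texas = meters_per_degree_lon(31.28)     # roughly 95 km at the example route's latitude
print(f"equator: {equator / 1000:.1f} km, 31.28 N: {texas / 1000:.1f} km")
```

At the latitude of the example route, treating a degree of longitude as 111 km would overstate east-west distances by roughly 15%.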
The solution is to project the coordinates into a local Cartesian coordinate system using pyproj:
```python
from pyproj import Transformer
import math

def project_pipeline(
    points: list[tuple[float, float, float]],
    crs_from: str = "EPSG:4326",   # WGS84 geographic
    crs_to: str = "EPSG:32614",    # UTM Zone 14N for central Texas
) -> list[tuple[float, float, float]]:
    transformer = Transformer.from_crs(crs_from, crs_to, always_xy=True)
    projected = []
    for lon, lat, elev in points:
        x, y = transformer.transform(lon, lat)
        projected.append((x, y, elev))
    return projected

def cumulative_distance(
    projected: list[tuple[float, float, float]]
) -> list[float]:
    distances = [0.0]
    for i in range(1, len(projected)):
        x0, y0, z0 = projected[i - 1]
        x1, y1, z1 = projected[i]
        # 3D distance including elevation change
        d = math.sqrt((x1 - x0)**2 + (y1 - y0)**2 + (z1 - z0)**2)
        distances.append(distances[-1] + d)
    return distances
```

Choose the UTM zone appropriate for your pipeline's geographic location. For pipelines that cross a UTM zone boundary, either reproject at the boundary or use a single projection fitted to the full route. A State Plane (SPCS) zone works only when the whole route lies inside that zone.
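Picking the UTM zone can be automated from the route itself. The sketch below derives a WGS84 UTM EPSG code from a longitude and latitude using the standard 6-degree zone layout. The function name is hypothetical, and the special zone exceptions around Norway and Svalbard are deliberately ignored.

```python
def utm_epsg_for(lon: float, lat: float) -> str:
    """Return the WGS84 UTM EPSG code covering a point (standard 6-degree zones)."""
    zone = int((lon + 180.0) // 6) + 1
    zone = min(max(zone, 1), 60)              # clamp lon = 180 into zone 60
    base = 32600 if lat >= 0 else 32700       # 326xx = northern, 327xx = southern
    return f"EPSG:{base + zone}"

# Example: first vertex of the sample route in central Texas
print(utm_epsg_for(-97.4523, 31.2847))
```

One reasonable choice is to evaluate this at the route midpoint and pass the result as the target CRS, then check how far the endpoints sit from that zone's central meridian.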
Segmentation strategy
The raw KML trace may have thousands of vertices—far more than you need for a hydraulic model, and potentially fewer than you need if it was coarsely digitized. The segmentation step converts the continuous centerline into a set of discrete pipe segments (nodes and edges) that become the flowsheet.
Hydraulically significant breaks always become node boundaries:
- Diameter or wall-thickness changes (from the pipe specification)
- Pig launcher and receiver locations
- Block valve stations
- Compressor or pump stations
- Meter stations (pressure tap locations where you have field data to compare)
Elevation-driven breaks are added algorithmically. A common rule: if the elevation change within a contiguous span of the centerline exceeds a threshold (e.g., 100 ft) without a previously identified break, insert a node at the peak or valley. This ensures the hydrostatic contribution is accurately captured rather than averaged over a long segment.
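One possible reading of that rule is a turning-point detector with hysteresis: a peak or valley becomes a node only once the profile has reversed by at least the threshold. The sketch below is an illustration of that interpretation, not a standard algorithm from any library. The function name is hypothetical, and the threshold must be in the same units as the elevations (30.48 m for 100 ft).

```python
def elevation_breaks(elev: list[float], threshold: float) -> list[int]:
    """Indices of peaks and valleys confirmed by a reversal of at least `threshold`."""
    breaks = []
    direction = 0          # 0 = undetermined, +1 = climbing, -1 = descending
    hi_i = lo_i = 0        # indices of the running maximum and minimum
    for i in range(1, len(elev)):
        if elev[i] > elev[hi_i]:
            hi_i = i
        if elev[i] < elev[lo_i]:
            lo_i = i
        if direction >= 0 and elev[hi_i] - elev[i] >= threshold:
            if direction == 1:
                breaks.append(hi_i)        # confirmed peak
            direction, lo_i = -1, i        # start tracking the next valley
        elif direction <= 0 and elev[i] - elev[lo_i] >= threshold:
            if direction == -1:
                breaks.append(lo_i)        # confirmed valley
            direction, hi_i = 1, i         # start tracking the next peak
    return breaks

profile_m = [100, 110, 140, 135, 138, 100, 95, 130, 131]
print(elevation_breaks(profile_m, threshold=30.0))
```

A steady climb or descent produces no extra nodes under this scheme, because the net elevation change of the segment already captures it. Small ripples below the threshold are ignored.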
After segmentation, each segment has:
- A start node and end node (identified by cumulative distance)
- A horizontal length and a net elevation change (Δz)
- An along-pipe length (the 3D distance, slightly longer than horizontal for hilly terrain)
For a representative 50-mile natural gas gathering pipeline, you might end up with 15–40 hydraulic segments depending on terrain and facility locations.
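The per-segment quantities listed above can be computed directly from the projected vertices once the break indices are known. This is a minimal sketch: the Segment class and build_segments function are hypothetical names, and the input is assumed to be (x, y, z) tuples in meters.

```python
import math
from dataclasses import dataclass

@dataclass
class Segment:
    start_index: int
    end_index: int
    horizontal_m: float    # plan-view length
    along_pipe_m: float    # 3D length following the terrain
    delta_z_m: float       # net elevation change, end minus start

def build_segments(projected, break_indices):
    """Split a projected centerline into segments at the given vertex indices."""
    nodes = sorted({0, len(projected) - 1, *break_indices})
    segments = []
    for a, b in zip(nodes, nodes[1:]):
        horiz = along = 0.0
        for (x0, y0, z0), (x1, y1, z1) in zip(projected[a:b], projected[a + 1:b + 1]):
            h = math.hypot(x1 - x0, y1 - y0)
            horiz += h
            along += math.hypot(h, z1 - z0)
        segments.append(Segment(a, b, horiz, along, projected[b][2] - projected[a][2]))
    return segments

demo = [(0, 0, 0), (3, 4, 0), (6, 8, 12), (6, 13, 12)]
for seg in build_segments(demo, break_indices=[2]):
    print(seg)
```

Use the along-pipe length for friction and the net elevation change for the hydrostatic term.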
Assigning pipe specifications and fluid properties
The KMZ file contains only the geographic route—it does not contain pipe size, grade, or fluid composition. These come from the pipeline's as-built drawings, the MAOP documentation, and the operating composition analysis. Typical assignments per segment:
| Parameter | Source | Typical value range |
|---|---|---|
| Internal diameter | As-built or spec sheet | 4–48 inches |
| Pipe roughness | Handbook value for commercial steel (ASME B36.10M covers pipe dimensions, not roughness) | 0.0018 in (0.046 mm) |
| Fluid composition | GC analysis from SCADA or custody meter | CH₄ 85–97%, balance C₂–C₆, N₂, CO₂ |
| Inlet pressure | SCADA or design basis | 300–1,200 psig |
| Ambient temperature | Weather data or seasonal average | 30–95°F depending on region/season |
| Heat transfer coefficient | Burial depth, soil type, insulation | 0.2–2.0 BTU/(hr·ft²·°F) for buried |
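For the hydraulic calculation, the CalebBell fluids library provides:
- Darcy friction factor correlations, including Colebrook-White and explicit approximations such as Churchill
- Darcy-Weisbach pressure drop per segment

The Joule-Thomson coefficient used to estimate gas cooling is a thermodynamic property, so it comes from a property package (for example the companion chemicals and thermo libraries, or CoolProp) rather than from the hydraulics routines.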
```python
from fluids import friction_factor, dP_from_K

# Illustrative SI inputs: density, velocity, internal diameter, viscosity
rho, v, D, mu = 50.0, 8.0, 0.3874, 1.2e-5
# Illustrative roughness (m), segment length (m), elevation rise (m)
roughness, L, delta_z = 4.6e-5, 5000.0, 12.0

Re = (rho * v * D) / mu                         # Reynolds number
fd = friction_factor(Re=Re, eD=roughness / D)   # Darcy friction factor
K = fd * (L / D)                                # resistance coefficient
dP_friction = dP_from_K(K=K, rho=rho, V=v)      # Pa
dP_elevation = rho * 9.81 * delta_z             # Pa; a loss when uphill, a gain when downhill
```

For natural gas pipelines, the compressibility factor Z is required to convert between mass flow and volumetric flow and to compute the gas density at each segment's average pressure and temperature. For pipeline gas (methane-dominant), Peng-Robinson gives Z accurate to within about 0.5% at typical pipeline conditions; GERG-2008 via CoolProp gives better than 0.1% if you need higher accuracy (see GERG-2008 vs Peng-Robinson for when this matters).
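The role of Z in the density calculation follows from the real-gas law, ρ = P·M / (Z·R·T). The sketch below evaluates it for one state. The pressure, temperature, molar mass, and Z value are assumed for illustration only; in practice Z would come from the chosen equation of state.

```python
R_GAS = 8.314462618  # J/(mol·K), universal gas constant

def gas_density(P_pa: float, T_k: float, molar_mass_kg_per_mol: float, Z: float) -> float:
    """Real-gas density in kg/m^3 from rho = P*M / (Z*R*T)."""
    return P_pa * molar_mass_kg_per_mol / (Z * R_GAS * T_k)

# Assumed example state: 6 MPa, 15 °C, lean natural gas, Z taken as 0.87
rho_gas = gas_density(P_pa=6.0e6, T_k=288.15, molar_mass_kg_per_mol=0.01704, Z=0.87)
print(f"{rho_gas:.1f} kg/m^3")   # about 49 kg/m^3
```

Because density is inversely proportional to Z, a given percentage error in Z carries through as nearly the same percentage error in density, and from there into both the friction and hydrostatic terms.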
Building and solving the flowsheet graph
With segments characterized, the pipeline is a directed graph: nodes are pressure/temperature states, edges are pipe segments. For a simple linear pipeline (no loops, one inlet, one outlet), you solve sequentially:
P_node[i+1] = P_node[i] - ΔP_friction[i] - ΔP_elevation[i] - ΔP_acceleration[i]
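T_node[i+1] = T_ambient + (T_node[i] - T_ambient) · exp(-U·π·D·L[i] / (ṁ·Cp))
The temperature equation assumes first-order thermal relaxation toward ambient temperature, a reasonable approximation for buried pipelines with a known overall heat transfer coefficient U. Here U is per unit of pipe surface area, so U·π·D·L[i] is the segment's total conductance UA. The equation neglects Joule-Thomson cooling, which can matter for gas lines with large pressure drops.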
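The sequential solve can be sketched in a few lines of plain Python. This is a simplified illustration under several stated assumptions: the Swamee-Jain explicit formula stands in for a library friction-factor call so the block has no dependencies, Z is held constant, density is evaluated at each segment inlet rather than at average conditions, the acceleration term is neglected, and U is taken per unit of pipe surface area. All names and input values are hypothetical.

```python
import math

R_GAS = 8.314462618  # J/(mol·K)

def swamee_jain(Re: float, eD: float) -> float:
    """Explicit approximation to the Colebrook Darcy friction factor (turbulent flow)."""
    return 0.25 / math.log10(eD / 3.7 + 5.74 / Re**0.9) ** 2

def march(P_in, T_in, segments, m_dot, D, roughness, mu, Cp, U, T_amb, M, Z):
    """Return [(P, T), ...] at each node; segments is a list of (length_m, delta_z_m)."""
    area = math.pi * D**2 / 4
    P, T = P_in, T_in
    profile = [(P, T)]
    for L, dz in segments:
        rho = P * M / (Z * R_GAS * T)            # gas density at the segment inlet
        v = m_dot / (rho * area)
        Re = rho * v * D / mu
        fd = swamee_jain(Re, roughness / D)
        dP_friction = fd * (L / D) * rho * v**2 / 2    # Darcy-Weisbach
        dP_elevation = rho * 9.81 * dz                 # positive when climbing
        P = P - dP_friction - dP_elevation
        T = T_amb + (T - T_amb) * math.exp(-U * math.pi * D * L / (m_dot * Cp))
        profile.append((P, T))
    return profile

profile = march(P_in=6.0e6, T_in=310.0, segments=[(5000.0, 10.0), (5000.0, -5.0)],
                m_dot=30.0, D=0.3874, roughness=4.6e-5, mu=1.2e-5, Cp=2300.0,
                U=2.0, T_amb=288.15, M=0.01704, Z=0.87)
for P, T in profile:
    print(f"P = {P / 1e5:.2f} bar, T = {T:.1f} K")
```

A production model would iterate each segment on average pressure and temperature and recompute Z from the equation of state at every step.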
For looped systems, injection points, or networks (gathering systems with multiple producing wells feeding a common header), the sequential approach breaks down. You need a network solver that simultaneously satisfies continuity (mass balance at each node) and momentum (pressure drop equations on each branch). NetworkX represents the graph structure; the hydraulic solver iterates on node pressures until branch flows converge.
This is the same problem structure as a process simulator's flowsheet convergence, which is why it naturally fits in a tool like Rankine: the pipeline network is just another flowsheet with pipe segments as unit operations.
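The smallest looped case is two parallel lines between the same pair of nodes. The sketch below finds the flow split by bisection so that continuity (the branch flows sum to the total) and momentum (both branches see the same pressure drop) hold together. It assumes a simplified resistance law dP = k·ṁ² for each branch. The function name and the k values are hypothetical, and a real network solver would handle many nodes at once.

```python
def split_parallel_flow(m_total: float, k1: float, k2: float,
                        tol: float = 1e-6, max_iter: int = 200) -> tuple[float, float]:
    """Split m_total between two parallel branches, each with dP = k * m**2."""
    lo, hi = 0.0, m_total
    m1 = 0.5 * m_total
    for _ in range(max_iter):
        m1 = 0.5 * (lo + hi)
        m2 = m_total - m1                       # continuity at the junction
        imbalance = k1 * m1**2 - k2 * m2**2     # zero when both pressure drops match
        if abs(imbalance) < tol:
            break
        if imbalance > 0:
            hi = m1                             # branch 1 carries too much flow
        else:
            lo = m1
    return m1, m_total - m1

m1, m2 = split_parallel_flow(m_total=30.0, k1=100.0, k2=400.0)
print(f"branch 1: {m1:.3f} kg/s, branch 2: {m2:.3f} kg/s")
```

For this resistance law the answer can be checked analytically, since ṁ₁/ṁ₂ = √(k₂/k₁). That makes it a useful sanity test before moving to a general nodal solver.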
When this gets harder
Multi-pipeline systems and loops. A gathering system with 20 producing wells and 5 headers is a network, not a linear pipeline. The solver must handle simultaneous mass balances at every junction node. Convergence can be slow or unstable if the initial pressure guess is poor. Good starting conditions matter—initialize with the steady-state analytical solution for the trunk line and iterate outward to the branches.
Compressor stations mid-route. The sequential solve breaks at a compressor station because the compressor adds energy to the stream, resetting the pressure and temperature. Each compressor station requires a separate sub-model (polytropic efficiency, suction and discharge conditions, speed/anti-surge control). The flowsheet must correctly sequence the pressure-rise and temperature-rise calculations across the compressor.
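As a first estimate of the temperature reset across a compressor, the polytropic relation for an ideal gas gives T₂ = T₁·(P₂/P₁)^((k−1)/(k·η_p)). The sketch below evaluates it. It assumes ideal-gas behavior with a constant heat capacity ratio k, and the suction conditions and efficiency are illustrative values only. A real station model would use real-gas properties and the compressor map.

```python
def polytropic_discharge_temperature(T1_k: float, P1_pa: float, P2_pa: float,
                                     k: float, eta_p: float) -> float:
    """Discharge temperature (K) for an ideal gas with polytropic efficiency eta_p."""
    exponent = (k - 1.0) / (k * eta_p)
    return T1_k * (P2_pa / P1_pa) ** exponent

# Assumed example: 300 K suction, 4 MPa to 7 MPa, k = 1.3, 80% polytropic efficiency
T2 = polytropic_discharge_temperature(300.0, 4.0e6, 7.0e6, k=1.3, eta_p=0.80)
T2_ideal = polytropic_discharge_temperature(300.0, 4.0e6, 7.0e6, k=1.3, eta_p=1.0)
print(f"discharge: {T2:.1f} K (isentropic limit {T2_ideal:.1f} K)")
```

The discharge pressure and this temperature become the inlet conditions for the sequential solve on the downstream side of the station.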
Liquid accumulation in low-lying segments. If the pipeline has significant elevation undulations and the gas is near its dew point (retrograde or rich gas gathering service), liquid can accumulate in sags. The hydraulic model changes: you need a two-phase flow model (Beggs-Brill, mechanistic, or compositional) rather than a single-phase gas model. The KMZ elevation profile is critical input for identifying where sags occur.
Elevation data quality. GPS elevation accuracy from handheld units is typically ±3–5 meters vertical, which can translate to significant errors in hydrostatic pressure calculations for high-relief terrain. USGS 3DEP 1-meter DEM data, available for much of the US, is the preferred supplement or replacement for GPS-derived elevation in the KML file.
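The size of the resulting pressure error is easy to bound with ΔP = ρ·g·Δz. The sketch below applies a 5 m elevation error to two assumed densities, one for dense-phase pipeline gas and one for a light crude. Both density values are illustrative.

```python
G = 9.80665          # m/s^2, standard gravity
PA_PER_PSI = 6894.757

def hydrostatic_error_pa(density_kg_m3: float, elevation_error_m: float) -> float:
    """Pressure error caused by an elevation error, from dP = rho * g * dz."""
    return density_kg_m3 * G * elevation_error_m

gas_err = hydrostatic_error_pa(50.0, 5.0)       # assumed gas density
liquid_err = hydrostatic_error_pa(850.0, 5.0)   # assumed crude oil density
print(f"gas: {gas_err / PA_PER_PSI:.2f} psi, liquid: {liquid_err / PA_PER_PSI:.2f} psi")
```

With these assumptions the per-node error is a fraction of a psi for gas and several psi for liquid, so elevation quality matters most for liquid lines and for gas lines where liquids accumulate.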
Coordinate system edge cases. Long pipelines (hundreds of miles) cross UTM zone boundaries. Processing everything in a single UTM zone introduces scale distortion toward the zone edges. Solutions: define a custom conformal projection fitted to the route (for example an oblique Mercator or Lambert conformal conic) that keeps scale distortion small along the full extent, project each segment in its local zone and stitch the results, or compute geodesic lengths directly on the ellipsoid.
How Rankine handles this
Rankine's pipeline import workflow accepts KMZ and KML files directly. You upload the file, Rankine extracts the centerline, projects it to the appropriate UTM zone, and displays the route on a canvas with the elevation profile below. You then apply segment breaks interactively—clicking on the map to add node boundaries at valve stations, compressor locations, or diameter changes. Rankine automatically computes the segment lengths, elevation changes, and cumulative distance table.
From there, you enter the pipe specification (diameter, grade, roughness) and fluid composition, and Rankine solves the hydraulic model using Darcy-Weisbach with the Colebrook-White friction factor and GERG-2008 (or PR, user-selectable) for gas properties. The full node-by-node pressure and temperature profile exports as a structured table for SCADA comparison or regulatory reporting.
For engineers comparing this capability to HYSYS or ProMax pipeline tools, the HYSYS alternatives article covers where each simulator's pipeline module has strengths and gaps. Learn more about Rankine's pipeline modeling at the homepage.
FAQ
What is a KMZ file and why does it matter for pipeline engineering?
KMZ is a compressed archive of one or more KML (Keyhole Markup Language) files, the format Google Earth uses to store geospatial data. Pipeline operators commonly receive KMZ files from surveying firms, GIS departments, or right-of-way databases containing the route centerline as a LineString geometry with coordinates (longitude, latitude, elevation). Extracting that geometry is the first step toward building a segmented hydraulic model because the elevation profile directly drives the hydrostatic pressure component of the pipeline pressure drop calculation.
Can I import a KMZ file directly into HYSYS or ProMax?
No—neither HYSYS nor ProMax accepts KMZ or KML as a direct import format. The standard workflow is to extract the pipeline geometry using a GIS tool or Python library, compute the segment-level elevation changes and pipe lengths, then manually or programmatically construct the pipeline flowsheet in the simulator. This is a data-preparation step, not a simulation step.
What Python libraries are useful for KMZ-to-flowsheet conversion?
The core Python stack: Python's built-in zipfile and xml.etree.ElementTree for KMZ/KML extraction, pyproj for coordinate projection (converting geographic to projected coordinates to get accurate distances), shapely for geometry operations, NetworkX for representing the pipeline graph, and CalebBell's fluids library for hydraulic calculations on each segment. The third-party packages are MIT or BSD licensed; zipfile and xml.etree.ElementTree ship with Python itself.
How do you handle elevation data in KMZ pipeline files?
KML LineString coordinates include a third value (altitude) in the format longitude,latitude,altitude. If the surveyor populated altitude from GPS or LiDAR, the values are usable after verification. If altitude is zero or missing, you must supplement with a digital elevation model—USGS 3DEP or SRTM—and sample the DEM along the pipeline centerline at your segmentation interval.
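A small check can flag a centerline whose altitude was never populated and swap in DEM values. In this sketch the DEM lookup is passed in as a callable, because the way a raster is sampled depends on the tooling you use. The function names are hypothetical, and flat_dem is a stand-in for a real sampler.

```python
from typing import Callable

Point = tuple[float, float, float]   # (lon, lat, elevation_m)

def altitude_missing(points: list[Point], tol: float = 1e-6) -> bool:
    """True when every altitude is zero, the usual sign of an unpopulated field."""
    return all(abs(p[2]) <= tol for p in points)

def fill_elevation(points: list[Point],
                   sample_dem: Callable[[float, float], float]) -> list[Point]:
    """Replace each altitude with the value returned by a DEM sampler."""
    return [(lon, lat, sample_dem(lon, lat)) for lon, lat, _ in points]

def flat_dem(lon: float, lat: float) -> float:
    """Stand-in sampler for illustration; a real one would read a 3DEP or SRTM raster."""
    return 285.0

route = [(-97.4523, 31.2847, 0.0), (-97.4498, 31.2851, 0.0)]
if altitude_missing(route):
    route = fill_elevation(route, flat_dem)
print(route)
```

Before mixing DEM values with surveyed altitudes, confirm that both use the same vertical datum.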
What thermodynamic properties do I need to assign to each pipe segment?
Each segment needs: internal diameter and wall thickness, roughness (typically 0.0018 inches for commercial steel, a handbook value rather than one set by ASME B36.10M, which covers pipe dimensions), fluid composition from a GC analysis, inlet pressure and temperature, and either a fixed outlet temperature or ambient temperature with insulation specification for thermal calculations.
Further reading
- OGC KML Standard, version 2.3: https://www.ogc.org/standards/kml/
- USGS 3D Elevation Program (3DEP), 1-meter DEM data for much of the US: https://www.usgs.gov/3d-elevation-program
- CalebBell fluids library (pipe hydraulics, friction factors, Darcy-Weisbach): https://github.com/CalebBell/fluids
- pyproj documentation (coordinate reference system transformations): https://pyproj4.github.io/pyproj/
- ASME B31.8-2022, Gas Transmission and Distribution Piping Systems. American Society of Mechanical Engineers. (Standard reference for pipeline hydraulic design requirements.)
- Menon, E. S. (2005). Gas Pipeline Hydraulics. CRC Press. (Practical reference for the Darcy-Weisbach method applied to compressible gas flow.)