A Shapefile is a vector geospatial data format developed by Esri in the early 1990s. Despite its singular name, it is a collection of files that must travel together: at minimum .shp (geometry), .shx (geometry index), and .dbf (attribute table), usually with a .prj (coordinate reference system) and often .cpg (character encoding).
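Because the parts only work as a set, a quick check before sending or loading a Shapefile can be useful. The following is a minimal sketch using only the Python standard library; the function name `missing_parts` and the split into "required" and "recommended" extensions are illustrative assumptions based on the list above, not part of any GIS library.

```python
from pathlib import Path

# Extensions taken from the description above.
REQUIRED = (".shp", ".shx", ".dbf")
RECOMMENDED = (".prj", ".cpg")


def missing_parts(shp_path):
    """Return (missing_required, missing_recommended) extensions.

    Hypothetical helper: looks in the folder of `shp_path` for files that
    share its base name and reports which sidecar extensions are absent.
    """
    base = Path(shp_path).with_suffix("")
    # Collect the extensions (lower-cased) of all files with the same base name.
    present = {
        p.suffix.lower()
        for p in base.parent.iterdir()
        if p.is_file() and p.stem == base.name
    }
    missing_required = [ext for ext in REQUIRED if ext not in present]
    missing_recommended = [ext for ext in RECOMMENDED if ext not in present]
    return missing_required, missing_recommended
```

Calling `missing_parts("faults.shp")` (a hypothetical path) on a folder that holds only `faults.shp`, `faults.dbf`, and `faults.prj` would report `.shx` as a missing required part and `.cpg` as a missing recommended one.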
Why it matters
The Shapefile remains one of the most widely exchanged GIS vector formats because nearly every GIS tool reads and writes it. It is nonetheless a dated format whose hard limits can cause real data loss, so understanding its constraints is essential when delivering or receiving spatial data.
Concrete example
A single Shapefile holds exactly one layer: one geometry type (point, line, or polygon, optionally with Z/M values) and one attribute table. Storing a project's faults and geological contacts as separate layers therefore needs at least two Shapefiles, each with its own sidecar files, whereas a single GeoPackage file can hold both layers.
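The one-geometry-type rule is visible in the file itself: the 100-byte header of the .shp stores a single shape type code for the whole file. The sketch below reads that code with the standard `struct` module. The byte offsets and type codes follow Esri's published Shapefile technical description (which also defines MultiPoint and MultiPatch types); the function names are illustrative assumptions.

```python
import struct

# Shape type codes as listed in Esri's Shapefile technical description.
SHAPE_TYPES = {
    0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint",
    11: "PointZ", 13: "PolyLineZ", 15: "PolygonZ", 18: "MultiPointZ",
    21: "PointM", 23: "PolyLineM", 25: "PolygonM", 28: "MultiPointM",
    31: "MultiPatch",
}


def shape_type_from_header(header: bytes) -> str:
    """Return the name of the single shape type declared in a .shp header."""
    if len(header) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    # Bytes 0-3: file code 9994, stored big-endian.
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("not a .shp file (unexpected file code)")
    # Bytes 28-35: version and shape type, stored little-endian.
    _version, shape_type = struct.unpack("<ii", header[28:36])
    return SHAPE_TYPES.get(shape_type, f"unknown ({shape_type})")


def read_shape_type(path) -> str:
    """Hypothetical convenience wrapper: read the header from a file on disk."""
    with open(path, "rb") as f:
        return shape_type_from_header(f.read(100))
```

Since the header has room for only one code, a file of faults would report a single type such as `PolyLine`; features of another geometry type have to live in a different Shapefile.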
Common pitfalls
- Field name limit: attribute (column) names are truncated to 10 characters in the .dbf, silently mangling longer names.
- 2 GB size cap: both .shp and .dbf are limited to roughly 2 GB.
- Encoding: without a correct .cpg, accented characters and non-ASCII text break.
- Missing sidecar files: a Shapefile shipped without its .prj loses its CRS, and without .shx/.dbf it is unusable. Always zip all parts together.
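The field name limit is easy to check before export. The sketch below is a minimal pre-flight check, assuming plain truncation to 10 characters and ASCII field names; some writers rename clashing fields instead of simply cutting them, so treat the result as a warning rather than a prediction of the exact output. The function name `truncation_report` and the sample field names are hypothetical.

```python
from collections import defaultdict

DBF_FIELD_NAME_LIMIT = 10  # characters, as described in the list above


def truncation_report(field_names, limit=DBF_FIELD_NAME_LIMIT):
    """Return (truncated, collisions) for a list of attribute names.

    truncated:  {original name: shortened name} for names over the limit
    collisions: {shortened name: [original names]} where two or more
                names become identical after shortening
    """
    truncated = {n: n[:limit] for n in field_names if len(n) > limit}
    by_short = defaultdict(list)
    for name in field_names:
        by_short[name[:limit]].append(name)
    collisions = {s: names for s, names in by_short.items() if len(names) > 1}
    return truncated, collisions


# Hypothetical attribute names for a faults layer.
truncated, collisions = truncation_report(
    ["fault_dip_angle", "fault_dip_azimuth", "id"]
)
print(truncated)   # both long names shorten to "fault_dip_"
print(collisions)  # so they can no longer be told apart
```

Renaming such fields to distinct names of 10 characters or fewer before export (for example `dip_angle` and `dip_azim`) avoids the clash.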
For most new work, GeoPackage or GeoJSON avoid these limits while keeping wide compatibility.