This section explains how importing a dataset creates columns in CARTO and which naming conventions you should use. It also describes how CARTO guesses content during the import process, lists the supported geospatial formats for uploading data, and explains how to upload multilayer datasets and perform batch file uploads.
When a file is imported, it is transformed into a dataset that can be processed by CARTO. The system automatically creates the following columns:
When a dataset is exported from CARTO, it includes the cartodb_id and the_geom columns, which are reused if the dataset is later imported back into the system. This ensures that re-importing an exported dataset preserves its original content and row order.
If you generate these columns yourself, they must follow the CARTO table requirements for the import to succeed. Datasets that do not meet the requirements (such as a dataset with duplicated integers in its cartodb_id column) fail to import.
Apply the following naming conventions for datasets in CARTO, and avoid using the reserved words as part of your file names.
There are certain words reserved in the system that cannot be used to name columns or datasets, mainly the PostgreSQL reserved words. Any names that conflict with a reserved word are prefixed with an underscore (_) automatically.
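The renaming rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not CARTO's implementation: the function name `sanitize_name` is hypothetical, and `RESERVED_WORDS` holds a small sample of PostgreSQL reserved words rather than the full list.

```python
# Illustrative subset only; PostgreSQL's full reserved-word list is much longer.
RESERVED_WORDS = {"select", "table", "user", "order", "group", "where"}


def sanitize_name(name: str) -> str:
    """Prefix a name with an underscore when it collides with a reserved word."""
    if name.lower() in RESERVED_WORDS:
        return "_" + name
    return name


print(sanitize_name("order"))
print(sanitize_name("population"))
```

Checking your column names this way before uploading lets you anticipate which ones will be renamed on import.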
CARTO includes guessing functionality during the import process. This is useful when files or data are missing some upload information. The following guessing options are available:
Fields guessing
For files whose format does not include type information (usually CSV files), field guessing options can be enabled. There are two guessing options for this type of file:
Type guessing: determines the type of each imported column from its text contents in the CSV file. If enabled, it generates numeric and boolean columns when appropriate. Otherwise, it uses regular string columns.
Quoted fields guessing: when enabled, double-quoted fields are included in type guessing. When disabled, double-quoted fields are excluded from type guessing.
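To make the idea of type guessing concrete, here is a minimal sketch of how a column type could be inferred from raw text values. It is an assumption-laden illustration, not CARTO's actual algorithm: the function name `guess_type`, the accepted boolean spellings, and the handling of empty cells are all hypothetical choices, and quoted fields guessing is not modeled.

```python
def guess_type(values):
    """Return 'boolean', 'number', or 'string' for a column of raw CSV text."""
    # Assumption: empty cells are ignored when guessing.
    non_empty = [v.strip() for v in values if v.strip() != ""]
    if not non_empty:
        return "string"
    # Assumption: only the spellings "true" and "false" count as booleans.
    if all(v.lower() in {"true", "false"} for v in non_empty):
        return "boolean"
    try:
        for v in non_empty:
            float(v)  # raises ValueError for non-numeric text
    except ValueError:
        return "string"
    return "number"


print(guess_type(["1", "2.5", ""]))
print(guess_type(["true", "False"]))
print(guess_type(["1", "abc"]))
```

A single non-numeric value is enough for this sketch to fall back to a string column, which mirrors the "when appropriate" wording above.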
Content Guessing
Files that contain country, city, or IP address information can be automatically geocoded by the system if the content guessing option is enabled. This automatic geocoding only occurs if a column does not contain a large proportion of repeated or null values. Content guessing does not require the target columns to be named in a special way (such as “country” or “city”). CARTO inspects the available columns and identifies which of them can be guessed geospatially.
Tip: For information about how to granularly configure the guessing options for your import process, view the upload file parameters in the standard tables section.
CARTO supports several geospatial data formats to upload vector data. The important details of each format, as well as some guidelines to upload your files to CARTO, are defined in this section.
The Shapefile format is a multi-file format: it consists of a set of files that share the same name, are stored in the same directory, and are differentiated by their extension.
A Shapefile must include, at least, a .shp file, a .shx file, a .prj file and a .dbf file. These files contain the geometry data, the indexes, the projection information and the attributes, respectively. Other auxiliary files are not mandatory and contain extra information for the Shapefile. Shapefiles must be imported as a single compressed file, in the .zip or .gz format.
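Before uploading, you can check that a zipped Shapefile contains the four mandatory components. The following is a minimal sketch using only the Python standard library. The helper name `missing_shapefile_parts` is hypothetical, and the check only looks at extensions, so it does not verify that the files share the same base name.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Mandatory components listed in the text above.
REQUIRED_EXTENSIONS = {".shp", ".shx", ".prj", ".dbf"}


def missing_shapefile_parts(zip_source):
    """Return the required extensions that are absent from a zipped Shapefile."""
    with zipfile.ZipFile(zip_source) as archive:
        found = {PurePosixPath(name).suffix.lower() for name in archive.namelist()}
    return sorted(REQUIRED_EXTENSIONS - found)


# Usage: build a small in-memory archive that lacks the .prj file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for extension in (".shp", ".shx", ".dbf"):
        archive.writestr("roads" + extension, b"")
buffer.seek(0)
print(missing_shapefile_parts(buffer))
```

`zip_source` can be a file path or a file-like object, as accepted by `zipfile.ZipFile`.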
Note: The Shapefile format has certain limitations that can affect the way that your datasets are exported/imported into CARTO:
The KML format is an XML-based format that adds geographical meaning by defining features such as points, polygons or lines in the EPSG 4326 projection.
KML uses common XML types such as string, boolean, double, or int, so your column types are respected when your dataset is imported into or exported from CARTO.
Each feature is defined as a Placemark element, which usually contains a name, a description, and the geometry itself. If more data columns are required, these fields need to be defined and included inside an ExtendedData element of the KML document.
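As an illustration of the Placemark structure described above, the sketch below builds one Placemark with extra attributes stored under ExtendedData, using Python's standard `xml.etree.ElementTree`. The helper name `build_placemark` and the sample attribute `score` are hypothetical. The `Data`/`value` layout follows the KML specification, and the enclosing `kml` and `Document` elements are omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_placemark(name, description, lon, lat, extra):
    """Build a KML Placemark with a Point and extra columns in ExtendedData."""
    placemark = ET.Element("Placemark")
    ET.SubElement(placemark, "name").text = name
    ET.SubElement(placemark, "description").text = description
    extended = ET.SubElement(placemark, "ExtendedData")
    for key, value in extra.items():
        data = ET.SubElement(extended, "Data", name=key)
        ET.SubElement(data, "value").text = str(value)
    point = ET.SubElement(placemark, "Point")
    # KML coordinates are written as longitude,latitude.
    ET.SubElement(point, "coordinates").text = f"{lon},{lat}"
    return placemark


placemark = build_placemark("Null Island", "Origin point", 0, 0, {"score": 100})
print(ET.tostring(placemark, encoding="unicode"))
```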
In terms of geometric elements, the Point, Polygon, Line, MultiGeometry and Geometry elements are supported. Different geometry types in the same layer are not supported.
A Keyhole Markup Language Zipped (KMZ) file is a compressed file that includes a KML file and zero or more supporting files (images, icons, overlays or other elements referenced in the KML file). See the Keyhole Markup Language (KML) section for more information.
The GeoJSON format is an extension of the JavaScript Object Notation (JSON) that encodes geographical features and their metadata. This format supports data types such as string, double or boolean. Dates exported as GeoJSON are stored as strings and are treated as strings when the data is imported.
With respect to geometries, Points, (Multi)Polygons and (Multi)Lines are supported. GeometryCollection geometric objects are not supported and will raise an import error. The supported geometries can be imported inside FeatureCollection and Feature objects.
Importing different geometry types in a FeatureCollection element is not supported.
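The two GeoJSON restrictions above can be checked before upload. The following is a minimal sketch, with the hypothetical helper name `check_geojson`. It treats each geometry type name as distinct (so `Polygon` and `MultiPolygon` count as different types), which is an assumption and may be stricter than the importer itself.

```python
import json


def check_geojson(geojson_text):
    """Return a list of problems that conflict with the restrictions above."""
    document = json.loads(geojson_text)
    if document.get("type") == "FeatureCollection":
        features = document.get("features", [])
    else:
        features = [document]  # a single Feature object

    geometry_types = set()
    for feature in features:
        geometry = feature.get("geometry") or {}
        if geometry.get("type"):
            geometry_types.add(geometry["type"])

    problems = []
    if "GeometryCollection" in geometry_types:
        problems.append("GeometryCollection is not supported")
        geometry_types.discard("GeometryCollection")
    if len(geometry_types) > 1:
        problems.append("mixed geometry types: " + ", ".join(sorted(geometry_types)))
    return problems


sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Point", "coordinates": [0, 0]}},
        {"type": "Feature", "properties": {},
         "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 1]]}},
    ],
})
print(check_geojson(sample))
```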
Comma-Separated Values (CSV) or Tab-Separated Values (TSV) files can be imported to CARTO. For a successful import, follow these formatting guidelines:
name, description, score
"John Doe", "Awesome, the best player ever", 100

name, geojson
"Null Island", "{""type"": ""Point"", ""coordinates"": [0,0]}"
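The quoting style in the second example, where a field is wrapped in double quotes and embedded double quotes are doubled, is what Python's standard `csv` module produces by default. The sketch below writes a GeoJSON geometry into a CSV column that way. The helper name `rows_to_csv` is hypothetical, and note that the `csv` module emits no space after the delimiter and only quotes fields that need it.

```python
import csv
import io
import json


def rows_to_csv(rows):
    """Serialize (name, geometry dict) pairs to CSV with a geojson column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "geojson"])
    for name, geometry in rows:
        # json.dumps output contains commas and double quotes, so the csv
        # module quotes the field and doubles each embedded quote.
        writer.writerow([name, json.dumps(geometry)])
    return buffer.getvalue()


text = rows_to_csv([("Null Island", {"type": "Point", "coordinates": [0, 0]})])
print(text)
```

Letting a CSV library handle the escaping avoids hand-written quoting mistakes in fields that contain commas or quotes.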
As the CSV format does not specify the type of the columns in the data, CARTO applies a guessing functionality that converts your data to columns, using a supported format. This enables you to generate numeric columns, or geocode your dataset directly on import.
There are two guessing options for CSV files: type guessing and quoted fields guessing. View the Import Guessing section for details.
Excel files, or other spreadsheets (such as OpenDocument spreadsheets or Google Drive spreadsheets) are supported by CARTO.
The uploaded spreadsheet must follow this format:
For multi-sheet spreadsheets, only the first sheet will be imported.
GPX (GPS Exchange Format) files are XML documents that contain waypoints, tracks and/or routes. When importing a GPX file, CARTO generates different datasets for track points, tracks and waypoints. The resulting names of these datasets are a combination of the GPX name and their type: _track_points, _tracks, and _waypoints, respectively.
CARTO supports importing OpenStreetMap dumps (.osm files). These files are XML documents that have an osm parent element that can contain blocks of nodes, ways, or relations representing points, lines or polygons. CARTO automatically separates OSM dumps into different tables, depending on the geometry. Therefore, importing a single OSM file can lead to more than one resulting dataset.
The MapInfo file format is a multi-file geospatial vector data format developed by MapInfo, which also supports grids. MapInfo files (.DAT, .ID, .MAP, .TAB) must be imported as a single compressed file, in the .zip or .gz format.
CARTO files are CARTO-generated map visualization files. A .carto file includes the dataset and visualization definition, which contains any SQL queries, CartoCSS, basemaps, attributions, metadata, and styling that was applied to a map. This is useful for downloading complete CARTO visualizations that you can share or import.
Several of the formats supported by CARTO can store different layers, or geometric types, by definition. Importing a file that contains more than one layer results in different imported datasets.
If the create_vis option is enabled in the import process, the different imported layers are added to the created map. The number of layers that can be included in a map depends on the maximum number of layers per map in the user's configuration.
The maximum number of datasets created from a multilayer file is 10. If the imported file contains more than 10 layers, the layers beyond that limit are omitted.
The different layers included in a Shapefile are imported as independent datasets.
KML files generate a different dataset for each Folder that they contain.
GPX files that contain more than one type of element (waypoints, tracks, and/or routes) are imported as a different dataset per type.
OSM files generate a different layer, per each type of geometry that their nodes, ways, or relations represent (points, polygons or lines).
You can perform a batch file upload if the files are sent to the server in a compressed format. As with multilayer uploads, if the import process is configured to generate a map after import, the different datasets are added as layers to the new map. The number of layers that can be included in a map depends on the maximum number of layers allotted to the user's account.
The maximum number of files that can be imported in a single compressed file is 10. If the compressed file contains more than 10 files, only the first 10 files are imported and the rest of the files are omitted.
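To preview which files in an archive would fall within the 10-file limit, you can list its contents before uploading. The sketch below uses the hypothetical helper name `plan_batch_import` and assumes that "the first 10 files" means the first 10 entries in archive order, which the text above does not specify.

```python
import io
import zipfile

MAX_FILES_PER_ARCHIVE = 10  # limit stated in the text above


def plan_batch_import(zip_source, limit=MAX_FILES_PER_ARCHIVE):
    """Split an archive's file names into (imported, omitted) lists."""
    with zipfile.ZipFile(zip_source) as archive:
        # Directory entries are skipped; only files count toward the limit.
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    return names[:limit], names[limit:]


# Usage: an in-memory archive with 12 small CSV files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for index in range(12):
        archive.writestr(f"file_{index:02d}.csv", "name\nexample\n")
buffer.seek(0)
imported, omitted = plan_batch_import(buffer)
print(len(imported), omitted)
```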