The Pangea Model
This section provides an extended introduction to the main mechanisms involved in Pangea.
The general flow of operations in Pangea is represented in Fig.1. We start with non-gridded georeferenced data that characterize the natural system (loosely defined as "the world"). Using its GIS engine [1], Pangea builds a set of grids that cover all relevant media, hence defining the geometrical system, and projects the georeferenced data onto this system. Gridded data and geometric and topological parameters are then reindexed [2] into a mathematical compartmental system called the virtual system. This system is solved [3] (e.g. for environmental concentrations at steady state), further computations are performed (e.g. population exposure), and the solution is projected back onto relevant grids [4] for analysis/visualization.
The governing equations are simple; the virtual system is a linear system of first-order ordinary differential equations (with time-dependent or constant coefficients) of dimension \(n^v\) :
\[\frac{\mathrm{d}m(t)}{\mathrm{d}t} = \mathbf{K}(t)\:m(t) + s(t) \quad \text{or as simple as} \quad \frac{\mathrm{d}m(t)}{\mathrm{d}t} = \mathbf{K}\:m(t) + s\] where \(m(t)\in\mathbb{M}_{n^v \times 1}\) is a vector of masses of pollutant, \(s(t)\in\mathbb{M}_{n^v \times 1}\) is a vector of emissions, and \(\mathbf{K}(t)\in\mathbb{M}_{n^v \times n^v}\) is a matrix of transfer rate coefficients.
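For constant coefficients, the steady state follows from setting the time derivative to zero, which gives \(\mathbf{K}\,m + s = 0\). The sketch below illustrates this with Python and NumPy on a hypothetical two-compartment system (air and water). The rate constants, the compartment choice and the variable names are assumptions made for illustration; none of this is Pangea code or Pangea data.

```python
import numpy as np

# Hypothetical rate constants in 1/day (illustrative values only).
k_air_loss = 0.5       # total loss rate from air (degradation + transfers)
k_air_to_water = 0.1   # transfer rate air -> water
k_water_loss = 0.2     # total loss rate from water

# K[i, j] is the transfer rate from compartment j to compartment i;
# each diagonal entry is minus the total loss rate of that compartment.
K = np.array([[-k_air_loss, 0.0],
              [k_air_to_water, -k_water_loss]])

# Emission vector: 1 mass unit per day into air, nothing into water.
s = np.array([1.0, 0.0])

# Steady state: dm/dt = 0  =>  K m + s = 0  =>  solve K m = -s.
m_steady = np.linalg.solve(K, -s)
print(m_steady)
```

Solving the linear system directly avoids forming the inverse of \(\mathbf{K}\) explicitly, which matters as \(n^v\) grows.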
The complexity lies in the construction of this system: how to go from a set of georeferenced data to a global system of 3D multiscale grids covering all relevant media and whose cells delineate inhomogeneous content (e.g. terrestrial grid cells contain all relevant terrestrial media), and then to a mathematical compartmental system that describes the evolution of a set of homogeneous compartments.
The model structure (functional blocks) is represented in Fig.2. While Pangea v1 was about half MATLAB and half Python/ArcGIS, the mid-term goal with v2 is to propose a MATLAB-only model. The main reason is that a model that involves multiple languages is difficult to maintain, and it is difficult for users to understand what must be done, when and where. Version 1 was therefore difficult to analyze and debug for both developers and users. Fig.2 shows a Core, a reindexing engine, a computation engine, a set of environmental models, a GIS engine, and a set of more general models (atmospheric, hydrological, etc). While this figure is a bit outdated (the model was recently restructured), it shows most constitutive blocks of the model.
In practice, Pangea comes as a bundle with:
The core of the model is a MATLAB package named Pangea (folder [pangearoot]/Model/[version,revision]/+Pangea). A project is a MATLAB M-file that builds an instance of the Pangea.Model class and performs a series of operations for setting up all relevant components of the project. Multiple approaches are possible, ranging from the creation of a predefined type of "easy" project (e.g. single point source with default parameters) such as below:
%  Initialize and load model.
addpath( '..\..\Model\Current' )
model = Pangea.Model()

%  Create point source ezProject, add pollutant and source.
%  Source @ lon = 4, lat = 49, alt = 10 m, to air, intensity = 1 EMU/day.
project = model.addEzProject( 'PointSource' ) ;
pollutant = project.addPollutant( '71432', 'Benzene' ) ;
project.addPointSource( 4, 49, 10, 'air', pollutant, 1 ) ;

%  Run and export.
project.run() ;
project.results.export() ;
... to M-files that contain several hundred lines and define every relevant parameter specifically.
Pangea comes with a default set of grids and media. Grids are built at runtime and are hence project-specific. This allows advanced users to build their own grids. The set of media is intimately bound to the set of EPMs; yet, advanced users can introduce new media if they provide the EPMs that describe/model pollutant fate and transport within these media and between these media and other media.
Background grid  :  single layer, low resolution, e.g. 40x20 cells over the whole world. 
Results grid  :  single layer, multiscale; this is the grid onto which all results are ultimately projected. 
Atmospheric grid  :  17 layers, 3D multiscale. 
Terrestrial grid  :  single layer, clustered, weakly multiscale. 
Sediments grid  :  piggyback of terrestrial grid, limited to cells which contain fresh water. 
Sea/ocean grid  :  single cell; a more complex version is under development (with very high priority), the complexity coming from coastal zones. 
Air  :  ... 
Fresh water  :  ... 
Sediments  :  ... 
Natural lands  :  ... 
Agricultural lands  :  ... 
Sea water  :  ... 
The refinement potential (RP) is a practical solution for specifying constraints on the resolution, as a basis for building multiscale grids. It is by definition a global scalar field whose value at each point of the globe defines the user's "interest for high resolution". In practice, it is a raster that results from multiple weighted contributions, e.g. population density, distance from source(s), etc.
Fig.3 provides an example. A background grid with low resolution is represented on the left. The center represents an RP obtained as a weighted sum of three components: a raster of population counts, a step function of the distance from a source located at the center of the disks, and an "in land" flag. The figure on the right is the refined multiscale grid obtained after an iterative refinement procedure.
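A weighted sum of this kind can be sketched in a few lines of Python with NumPy. The function name, the weights, the normalisation of population counts to [0, 1] and the toy 4x4 rasters are all assumptions made for illustration; the source does not specify how Pangea scales or weights the components.

```python
import numpy as np

def refinement_potential(population, dist_to_source, in_land,
                         weights=(1.0, 1.0, 1.0), radius=5.0):
    """Weighted sum of three rasters of identical shape (hypothetical helper)."""
    # Assumption: scale population counts to [0, 1] so weights are comparable.
    pop_max = population.max()
    pop = population / pop_max if pop_max > 0 else np.zeros(population.shape)
    # Step function of the distance: 1 inside the disk, 0 outside.
    near = (dist_to_source <= radius).astype(float)
    w_pop, w_near, w_land = weights
    return w_pop * pop + w_near * near + w_land * in_land.astype(float)

# Toy rasters: two populated pixels, a source at pixel (1, 1),
# and a last column flagged as "not in land".
rows, cols = np.indices((4, 4))
population = np.zeros((4, 4))
population[0, 0] = 100.0
population[1, 1] = 50.0
dist_to_source = np.hypot(rows - 1, cols - 1)
in_land = np.ones((4, 4), dtype=bool)
in_land[:, 3] = False

rp = refinement_potential(population, dist_to_source, in_land,
                          weights=(0.5, 0.3, 0.2), radius=1.0)
print(rp)
```

Pixels that are populated, close to the source and on land accumulate the highest potential, which is what later drives the grid refinement towards them.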
The iterative procedure is depicted in another context (RP = global raster of population counts) in Fig.4: depth 0 is the background grid; the potential is integrated on this grid (providing a summary per cell, in practice the sum of pixels), and all cells whose summary value is above a user-defined threshold are refined (e.g. split in 4). This defines a new grid; the potential is integrated over this new grid, and so on.
The Background grid is a user-defined regular/rectangular grid in the space of longitudes and latitudes that covers the globe (e.g. 40x20 cells). It is the starting point of the refinement procedure.
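The loop described above can be sketched as a quadtree-style refinement over a raster. This is a minimal Python illustration, not the Pangea implementation: cells are expressed in pixel units rather than longitude/latitude, the function name and arguments are hypothetical, and the raster dimensions are assumed to be multiples of a power-of-two base cell size.

```python
import numpy as np

def refine(potential, base_size, threshold, max_depth):
    """Return the grids obtained at each depth, coarsest first.

    A grid is a list of square cells (row, col, size) in pixel units.
    """
    n_rows, n_cols = potential.shape
    # Depth 0: regular background grid.
    grid = [(r, c, base_size)
            for r in range(0, n_rows, base_size)
            for c in range(0, n_cols, base_size)]
    grids = [grid]
    for _ in range(max_depth):
        new_grid = []
        for r, c, size in grid:
            # Integrate the potential over the cell (sum of pixels).
            total = potential[r:r + size, c:c + size].sum()
            if total > threshold and size > 1:
                half = size // 2
                # Split the cell in 4.
                new_grid += [(r, c, half), (r, c + half, half),
                             (r + half, c, half), (r + half, c + half, half)]
            else:
                new_grid.append((r, c, size))
        if len(new_grid) == len(grid):
            break  # nothing was refined at this depth
        grid = new_grid
        grids.append(grid)
    return grids

# Toy potential: a single "hot" pixel in the top-left corner of an 8x8 raster.
potential = np.zeros((8, 8))
potential[0, 0] = 10.0
grids = refine(potential, base_size=4, threshold=5.0, max_depth=3)
print([len(g) for g in grids])
```

The function keeps every intermediate grid rather than only the last one, since the text notes that Pangea reuses the intermediary grids, e.g. when building the atmospheric grid.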
The Results grid is a multiscale grid obtained through the refinement procedure explained above; it is the grid onto which all results are ultimately projected. It is user-defined and project-specific, built with high resolution where the user requests it through the definition of the refinement potential, and with low(er) resolution elsewhere. Pangea uses this grid (as well as all intermediary grids obtained during the refinement procedure) e.g. in its procedure for building the default atmospheric grid, where it defines the first atmospheric layer. Projecting all results onto this grid at the end is required for comparison: environmental concentrations of pollutant in the air and in fresh water could not be compared if they were expressed on different geometries.
The default atmospheric grid is made of 17 layers based on grids obtained during the refinement process described in the previous section. The first layer is the Results grid. The second layer is the grid defined at the last intermediary step of the refinement, and so on until we reach the background grid. All layers above are then defined using the background grid. In Pangea v1.x, layer altitudes were defined to match GEOS-Chem sigma surfaces. Pangea v2 currently implements (as a prototype) an atmospheric model that interpolates in 3D, making it possible to specify user-defined altitudes.
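The layer stacking rule can be sketched as follows, in Python for illustration. The function name is hypothetical, grids are represented by simple labels, and the sketch assumes the refinement history is shorter than the number of layers; how Pangea handles the opposite case is not stated in the source.

```python
def atmospheric_layers(refinement_grids, n_layers=17):
    """Assign one horizontal grid per atmospheric layer, ground layer first.

    refinement_grids: grids from the refinement procedure, coarsest first,
    i.e. [background, ..., last intermediary, results grid].
    """
    # Walk the refinement history backwards: finest grid at the ground.
    layers = list(reversed(refinement_grids))[:n_layers]
    # All remaining upper layers use the background grid.
    background = refinement_grids[0]
    layers += [background] * (n_layers - len(layers))
    return layers

# Labels stand in for actual grids in this toy example.
history = ["background", "depth 1", "depth 2", "results"]
layers = atmospheric_layers(history)
print(layers[:5])
```

The horizontal resolution thus decreases with altitude until it reaches that of the background grid, which is then used for every layer above.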

The default terrestrial grid is based on the WWDRII 0.5°\(\times\)0.5° fresh water model. A new version (in development) is based on the HydroBASINS model/database, and will be available sometime in 2016. In any case, the hydrology defines the terrestrial grid geometry of watersheds.
The current version aggregates the roughly 65,000 cells of the WWDRII model into clusters that delineate watersheds. The reason for clustering is that it significantly reduces the size of the numerical system. Fig.6 shows the WWDRII grid, which defines a geometry with maximal resolution.
Yet, cells delineate regions with inhomogeneous content, as shown in Fig.7, which provides the land cover type as defined by the GlobCover data set. Each cell (geometry/geography) can potentially lead to more than 20 compartments (GlobCover has 22(3) cover types) that represent homogeneous content in the virtual system. By default, Pangea aggregates these 22 types into 3 or 4, which reduces the system size, but it would still be extremely resource-intensive not to cluster.
The current clustering algorithm is based on stems from stream end points and travel time to the ocean. It is weakly constrained by the refinement potential. A clustered terrestrial grid is shown in Fig.8; it is then usually unclustered over regions of interest, as shown in Fig.9.
The sediments grid is a piggyback of the terrestrial grid, reduced to cells with a non-null fresh water content.