Thursday, December 28, 2017

Meteo #27 - Making Diurnal Variation Plot with GrADS

Making diurnal variation (DV) plot with GrADS is quite easy, especially if you have already known how to make Hovmoller diagram. The two plots are almost similar, with DV utilizes time-time axis instead of longitude-time or latitude-time pairs as Hovmoller. Furthermore, in order to make a DV with GrADS, one should first creates a binary file with modified X-Y grid which represents the time axis. A detail guide for making a GrADS gridded binary file could be read here

1. Creating GrADS binary file

A DV plot is basically a modified time series plot. If a time series runs along the full time period, then the DV divides the full time period (e.g. 1 month) into smaller 24-hour time periods hence giving the name 'diurnal' or daily plot. Since the data source is same, DV could use the same data set as time series, with the exception for the file writing method into binary file. 

'reinit'
'open aerosol.ctl'
'set fwrite aerosol_ts.dat'
'set gxout fwrite'
'set x 1'
'set y 1'
'set z 1'
'set t 1 744'
'd tloop(aave(dustload5,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
'd tloop(aave(msa,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
'd tloop(aave(dms,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
'd tloop(aave(pm25,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
'd tloop(aave(so2,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
'disable fwrite'

The above script will make a 744-hour (1 month) time series data of 5 variables into a binary file. Notice that the script will save the data with sequential order on time as the X-grid in the binary file. In other words, we change the time dimension into space dimension (T-grid into X-grid). This file writing method is different from common time series plot data.

2. Making GrADS CTL file

The CTL file for DV plots is basically same with time series, with the only difference on the dimension definition. The main trick to make a DV plot is to modify 1-dimension data, in this case, X-grid with 744 grid numbers, into 2-dimension data: X and Y-grids, with 24 and 31 grid numbers respectively.

DSET ^aerosol_ts.dat
TITLE This is experimental
UNDEF 99999.0
XDEF 24 LINEAR 1 1
YDEF 31 LINEAR 1 1
ZDEF 1 LINEAR 1 1
TDEF 1 LINEAR 00Z01MAY2017 1mo
VARS 5
dustload5 0 99 Total dustload
msa 0 99 MSA
dms 0 99 DMS
pm25 0 99 PM2.5 Aerosol
so2 0 99 Sulphur Dioxide
ENDVARS

Notice that instead of full time period (744 hours or grids) which was written as X-grid in the binary file, the XDEF contains 24 grids (which represents 24-hour), YDEF contains 31 grids (which represents the day/date) and TDEF only contains 1 grid. This will make GrADS 'thinks' that the data file is a 2-D data (monthly) instead of 1-D data (hourly). Save the file with any name e.g. aerosol_ts.ctl.

3. Displaying the DV plot

Once the CTL file is saved, just open the data as usual. Since the plot uses X and Y-grids which are originally used for real-world coordinates, don't forget to set the map draw off. We can use contour, shaded-contour or filled-grid to display the plot.

ga-> open aerosol_ts.ctl
ga-> set mpdraw off
ga-> set gxout grfill
ga-> d dustload5 

This is the result looks like (filled-grid with contour):


Thursday, December 21, 2017

Meteo #26 - Saving Multi-variable Data into GrADS-gridded Binary File with GrADS

The title might be a little bit confusing, but what I want to share in this post is about how to save data, say, from a certain data format into another one which has GrADS format. For example, you open a NetCDF (nc) data with 5 variables in it with GrADS, and you want to save them into GrADS binary file for further analysis with the tool. 

Again, why use GrADS to do such task when you can do it by programming for example FORTRAN, C or others? Well, that's because it's efficient to do the task with GrADS only. You don't need to write a program, compile it or debug it over and over again, hence saving much of your precious time. Anyway, before doing so, one should understand the grids order in the GrADS-gridded binary file.

GrADS-gridded binary file is actually an ordinary binary file with a certain order. It doesn't have header/metadata to describe its dimensions or variables to the tool or user who wants to read it. That's why you need a descriptor or control file (CTL) in order to open the data with GrADS. In other words, you'll need to understand how the data is ordered in the file to differentiate dimensions and variables before trying to read it.

Imagine a 1-D data of a time series, for example, hourly air temperature (T) for 9 hours. The data will have 9 records which represent the time (hour), with their own values which represent the T at each time.

29, 30, 30, 31, 30, 29, 30, 31, 30

By looking at the data, we knew that the T for the 1st hour is 29, then 30 for the 2nd, and so on. That's exactly how the data stored in a binary file. It doesn't contain any information about the exact time (in real world), but we knew that the first record is the value of T at the 1st hour because it's already explained in previous paragraph that the data is a 1-D time series of T for 9 hours. If the data is just shown 'raw' as it is without any description, nobody will know what kind of information it contains since 29 or 30 could mean anything other than temperature (e.g. age, or number of apples on the tree, etc.). 

So, the key word here is 'description' about the data, which gives information to the user to interpret its contents. 

Then, how about if the description says that it's not a 1-D data, but rather a 2-D (spatial) data of air temperature at a time, for example 12 AM? Let say, the records are interpreted with gridded or matrix structure for real world coordinates like this (remember, it's still the same record as before):

30 31 30
31 30 29
29 30 30

Then we knew that the 1st three records in the data (29, 30 and 30) is located at the lowest row, while the first members of each three records (29, 31 and 30) are located at the leftmost column. If we give each record an x-y coordinate, the grid should be like this:

29 (y1,x1), 30 (y1,x2), 30 (y1,x3), ... , 30 (y3,x1), 31 (y3,x2), 30 (y3,x3)

It's clear from those two examples that a binary data file is merely a sequential blocks of data. 1-D, 2-D or even 5-D data will always be treated with sequential order by computer. What makes them different to each other is the 'description' which explain the 'rules' of the order sequence of data in the file. From previous example, we knew that, even the records is the same, it will have different interpretation based on the description about its contents. For 1-D data, all records is interpreted as values with 9 time stamps, while for 2-D data, the records follow matrix structure to give each values x-y coordinates with only 1 time stamp. To make it simple, the order is like this:

1-D data ---> Time, Value
2-D data ---> Time, y-value, x-value

Back to GrADS, the tool also has certain rules to treat gridded-binary data, and a user needs to follow such rules in order to make GrADS save or read data into its binary format. GrADS can save/read up to 5-D gridded data with the following order:

Ensemble, Time, Variable, z, y, x

If you want, for example, saving 2-D (e.g. 2x2 grid) data with 2 different variables for 2 hours to GrADS-gridded binary format. 

Variable 1 at hour 1:  A (y1,x1), B (y1,x2), C (y2,x1), D (y2,x2)
Variable 1 at hour 2:  E (y1,x1), F (y1,x2), G (y2,x1), H (y2,x2)

Variable 2 at hour 1: I (y1,x1), J (y1,x2), K (y2,x1), L (y2,x2)
Variable 2 at hour 2: M (y1,x1), N (y1,x2), O (y2,x1), P (y2,x2)

then you should save the data with this order: 

[Hour 1: Variable 1 : y1,x1,y1,x2,y2,x,1,y2,x2], [Hour 1: Variable 2 : y1,x1,y1,x2,y2,x,1,y2,x2], [Hour 2: Variable 1 : y1,x1,y1,x2,y2,x,1,y2,x2], [Hour 2: Variable 2 : y1,x1,y1,x2,y2,x,1,y2,x2]

As the result, the binary file contents will have order like this (with values):

A, B, C, D, I, J, K, L, E, F, G, H, M, N, O, P

It might be confusing at the beginning, but once you understand the pattern, everything will make sense and pretty easy to follow.

Here's an example GrADS script for saving time series (1-D, 744 hours) data of 5 variables into a binary file:

'reinit'
'open aerosol.ctl'
'set fwrite aerosol_ts.dat'
'set gxout fwrite'
'set x 1'
'set y 1'
'set z 1'
timer=1
while(timer<=744)
 say 'writing fields to file on t: ' timer
 'set t 'timer
 'd tloop(aave(dustload5,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
 'd tloop(aave(msa,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
 'd tloop(aave(dms,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
 'd tloop(aave(pm25,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
 'd tloop(aave(so2,lon=76.2,lon=78.2,lat=27.6,lat=29.6))'
 timer=timer+1
endwhile
'disable fwrite'

Notice that all variable (fields) needs to be written into the file before moving to the next time stamp.

In order to open the binary file created from the script, you should follow the variable order again in the CTL file to make GrADS understand it. If you mess with the order, you still can read it by GrADS, but the results might be strange and confusing (e.g. dms may be interpreted as msa by GrADS). Here's the example CTL file to open the previously made binary file:

DSET ^aerosol_ts.dat
TITLE This is experimental
UNDEF 99999.0
XDEF 1 LINEAR 1 1
YDEF 1 LINEAR 1 1
ZDEF 1 LINEAR 1 1
TDEF 744 LINEAR 00Z01MAY2017 1hr
VARS 5
dustload5 0 99 Total dustload
msa 0 99 MSA
dms 0 99 DMS
pm25 0 99 PM2.5 Aerosol
so2 0 99 Sulphur Dioxide
ENDVARS

If you display the result in GrADS, it may look like this (e.g. variable dustload5, with few 'cosmetics' for display):


  

Wednesday, December 20, 2017

Meteo #25 - Opening NetCDF Data With GrADS Control File

NetCDF (or 'nc') is one of the most popular data formats in Geoscience universe, thus no wonder so many kinds of global dataset are distributed in nc format, including climate and meteorological data. While most of it could be easily opened and utilized by various data processing software, sometimes .. yes, sometimes s**t happens. This time, I would like to share how to open an nc data by using GrADS descriptor/control file (CTL).

Firstly, one may ask, why do you use GrADS to open an nc data? For more experienced people, the real question is probably: why the hell do you use GrADS CTL to open an nc??

Here are few reasons why use GrADS to open an nc:
  1. Built-in nc library. That means, users don't need to go through the troublesome NetCDF installation, just to open an nc data. What you need is just installing GrADS or OpenGrADS which is relatively much easier to do. Pre-installed nc binaries and libraries though, are recommended for some reasons which will be explain later in this post.
  2. Light, fast and free. This is an all-time classic reason. And yes, of course you can open an nc file using Matlab, ArcGIS or any sophisticated software but they are mostly resource-hogs and expensive.
  3. Efficient. Yup, you can also write a program (e.g. with FORTRAN) to open an nc file. However, using GrADS will save much of your time because you don't need to write, compile, or debug program as you do with programming languages.
Then why use GrADS CTL file to open an nc while you still can easily open it using GrADS 'sdfopen' or even 'xdfopen' commands?

The funny thing is, not all nc data could be open using above commands. You may sometimes encounter this annoying error message:

gadsdf: SDF file has no discernable X coordinate

It happens mostly because the nc data doesn't conform to the COARDS conventions (just Google, in case you've never heard of it). A 'good' nc data normally has a header or metadata section (hence called SDF or Self-Describing File) which contains complete information about the dataset, for example: dimensions (x,y,z,t), variables etc. GrADS 'sdfopen' or 'xdfopen' commands basically need to read that header in order to open an nc data, and when they found any incomplete information in a 'bad' nc data, that annoying message appears. While most of nc files follow COARDS conventions for storing the data, some of them may probably not, and that's exactly the reason why GrADS CTL file is very useful.

The catch is, instead of nc original header, the GrADS CTL file will be used as the 'replacement' header for opening the nc data. While it requires more efforts than just using 'sdfopen' or 'xdfopen', the user will have more control to the data by overriding the original descriptor.

Before opening an nc with GrADS CTL, a user needs to know two important information from the data:
  1. Dimensions. Since an nc data has gridded format, the user at least needs to know the grid numbers in x, y, z direction, as well as t (time) or e (ensemble), if they are needed. 
  2. Variable names and their order/structure. Since gridded data uses matrix-like sequential order to differentiate dimensions and variables, the user should know how the data is ordered in the file.
If you've already known those two information, that will be great. Otherwise, you'll need to use a tool in order to get information of nc's dimensions and variables. One of the most well-known tools to do such work is ncdump, which will be installed automatically when you install NetCDF on your system. That's why it's recommended to have pre-installed NetCDF binaries and libraries before you work with such data files.

Ok. Assume you don't know anything about the nc file you would like to open, here are the steps to open an nc data using GrADS CTL. For this example, I use an nc data which was an output from WRF-CHEM model simulation on Linux OS, with pre-installed NetCDF, and OpenGrADS ver 2.1.0.

1. Getting Header Information 

On the linux shell, make a symbolic link to the nc file you'd like to open. This is just to make things easier and not an obligatory though, thus you can omit this step if you want. For this example, I make a symbolic link of an nc file (wrfout_d01_2017-05-01_00:00:00) through a file named testnc, because the original file name is too long.

$ ln -sf wrfout_d01_2017-05-01_00:00:00 testnc

Execute ncdump to get the header of the nc file (or the link file).

$ ncdump -h testnc

Once the header is shown, scroll to the uppermost part of it. You may find something like this:

       dimensions:
        Time = UNLIMITED ; // (393 currently)
        DateStrLen = 19 ;
        west_east = 99 ;
        south_north = 109 ;
        bottom_top = 29 ;
        bottom_top_stag = 30 ;
        soil_layers_stag = 4 ;
        west_east_stag = 100 ;
        south_north_stag = 110 ;
        dust_erosion_dimension = 3 ;
        klevs_for_dust = 1 ;
        bio_emissions_dimension_stag = 41 ;
        klevs_for_dvel = 1 ;
        vprm_vgcls = 8 ;

Those are the dimensions we need to know. From the example, we found that the time dimension is 393, grid number in x direction is 99 (west_east), y direction is 109 (south_north) and z direction is 29 (bottom_top). The dimension header could be different for any nc file, thus you should at least know some information about the nc data before you open it. Otherwise, you could use your common sense to guess the dimensions, e.g. x usually related to west-east or longitude direction etc.

Next, scroll down the header to get information about the variables you would like to access. For this example, I would like to access a variable named BC2 (Hydrophilic Black Carbon). It may look like this:

         float BC2(Time, bottom_top, south_north, west_east) ;
                BC2:FieldType = 104 ;
                BC2:MemoryOrder = "XYZ" ;
                BC2:description = "Hydrophilic Black Carbon" ;
                BC2:units = "ug/kg-dryair" ;
                BC2:stagger = "" ;
                BC2:coordinates = "XLONG XLAT XTIME" ;

Pay attention to the top part of variable header. It shows the grid order of the variable: Time, bottom_top, south_north, west_east. In accordance with the dimensions, the grid order will be like this: t, z, y, x. You may want to make notes about each variables and their grid orders because you will need to put them into the GrADS CTL file later. Anyway, you don't need to list all variables in the nc data if you only want to access specific variables only (e.g. 5 out of 100 variables).

2. Making GrADS CTL File

Create a new CTL file with your favorite text editor (notepad, vi, vim, gedit etc.). The contents of CTL file to open an nc are almost no different with normal CTL files. For my example, it looks like this:

DSET ^testnc
TITLE This is experimental
DTYPE netcdf
UNDEF 99999.0
XDEF 99 LINEAR 1 1
YDEF 109 LINEAR 1 1
ZDEF 29 LINEAR 1 1
TDEF 393 LINEAR 00Z01MAY2017 1hr
VARS 5
DUSTLOAD_5=>dustload5 0 t,y,x Total dust loading
BC1=>bc1 29 t,z,y,x Hydrophobic Black Carbon (ug/kg-dryair)
BC2=>bc2 29 t,z,y,x Hydrophilic Black Carbon (ug/kg-dryair)
OC1=>oc1 29 t,z,y,x Hydrophobic Organic Carbon (ug/kg-dryair)
OC2=>oc2 29 t,z,y,x Hydrophilic Organic Carbon (ug/kg-dryair)
ENDVARS

DSET indicates the path of nc file (or its link) you would like to open. TITLE is just a title, you can write anything. DTYPE indicates the file format, you should put 'netcdf' for it. UNDEF indicates undefined value for each variable, if you don't know, just put -99.9e8 or any 'extreme' value as you like.

XDEF, YDEF, ZDEF and TDEF indicate the nc file dimensions you get from ncdump, as well as the first grid and its spacing in world coordinates. If you don't know the first grid or grid spacing of space dimensions (XDEF, YDEF and ZDEF), just put LINEAR 1 1 behind each entries. Otherwise, you should put the first grid coordinates and their spacing for each entries in world coordinates. For example:

XDEF 99 LINEAR 67.732000 0.311591836734694
   
For TDEF, you should define the first time stamp and time interval of each data. For my example, the first data is at 00 UTC of  01MAY2017, with 1 hour interval.

VARS indicates the number of variables you'd like to access in the data. In my example, I would like to access only 5 out of hundreds variables in the nc file.

Next entries, are the ones which make this nc CTL different from the normal GrADS CTL. You should list the variables you want to access with this syntax:

[VAR_NAME_IN_NC]=>[VAR_NAME_IN_GrADS] [NUMBER_OF_Z_LEVELS] [GRID_ORDER] [VAR_DESCRIPTION]

For example, I previously would like to access Hydrophilic Black Carbon in the nc data. You should firstly list its original name (which is BC2 in the nc data), then put '=>' sign before putting the variable name in GrADS, which could be any name you like (for my case, bc2). You can even put the same name for the GrADS variable name if you want.

Next, you should define the number of Z levels for the variable. Since BC2 is 4D variable data (with x,y,z, and t), the value should be 29 which is the same with Z dimension of the data. For other cases, 3D data for example (in this case, DUSTLOAD_5), which only has data with a single Z level, the value should be 0.

Finally, you should put the grid order as what you've found in the ncdump result (see first steps). For example, the grid order for variable BC2 from ncdump is: time, bottom_top, south_north, west_east. Then you should put it like this: t,z,y,x. Hence, in the end, it should be looked like this:

BC2=>bc2 29 t,z,y,x Hydrophilic Black Carbon (ug/kg-dryair)
DUSTLOAD_5=>dustload5 0 t,y,x Total dust loading

The rest (variable description) is free, you can write anything you like to describe the variable. Don't forget to put ENDVARS at the end of the file.

Save the CTL file with any name you like, for example: wrfout1.ctl.

3. Opening The NC Data

This is the last step and it's definitely no different with the normal way to open binary data with GrADS.

ga-> open wrfout1.ctl
Scanning description file:  wrfout1.ctl
Data file wrfout1 is open as file 1
LON set to 67.732 98.268
LAT set to 6.049 36.867
LEV set to 1 1
Time values set: 2017:5:1:0 2017:5:1:0
E set to 1 1
ga-> set t 100
Time values set: 2017:5:5:3 2017:5:5:3
ga-> d bc2
Contouring: 0.001 to 0.014 interval 0.001

And here's the result of my example (with x and y using real world coordinates):



From this point on, you can do anything with the nc file. You can add as many variable as you like in the CTL file or even save the variable data into a binary file for any further uses :-)