Content of this section:
- Data Structures
- Modules (1): GDAC, MITprof, GriddedFields
- Modules (2): MITprofAnalysis, MITprofStat
Data Structures
ArgoData.ProfileNative — TypeProfileNative
Container for a multivariate profile read from a GDAC Argo file.
- 1D arrays: lon,lat,date,ymd,hms,pnumtxt,direc,DATAMODE,isBAD
- 2D arrays: T,S,pressure,depth,T_ERR,SERR
ArgoData.ProfileStandard — TypeProfileNative
Container for a multivariate profile in MITprof format.
- 2D arrays: T,S,Testim,Sestim,Tweight,Sweight,Ttest,Stest,T_ERR,SERR
ArgoData.MITprofStandard — TypeMITprofStandard
Container for a MITprof format file data.
- filename : file name
- 1D arrays: lon,lat,date,depth,ID
- 2D arrays: T,S,Te,Se,Tw,Sw
ArgoData.Argo_pq — TypeBase.@kwdef struct Argo_pq
Dataset ::Parquet2.Dataset
schema ::Tables.Schema
files ::Vector
folder ::String
endModule: GDAC
ArgoData.GDAC.download_file — Functiondownload_file(file::DataFrameRow,suff="prof",ftp=missing)Get folder and wmo from data frame row and them call download_file.
using ArgoData
files_list=GDAC.files_list()
file=GDAC.download_file(files_list[10000,:])ArgoData.GDAC.download_file — Functiondownload_file(folder::String,wmo::Int,suff="prof",ftp="https://data-argo.ifremer.fr/dac/")Download Argo file from the GDAC server (<https://data-argo.ifremer.fr/dac/> by default) to a temporary folder (joinpath(tempdir(),"Argo_DAC_files"))
The file name is given by folder and wmo. For example, 13857_prof.nc for wmo=13857 is from the aoml folder.
The default for suff is "prof" which means we'll download the file that contains the profile data. Other possible choices are "meta", "Rtraj", "tech".
If the ftp argument is omitted or isa(ftp,String) then Downloads.download is used. If, alternatively, isa(ftp,FTP) then FTPClient.download is used.
Example :
using ArgoData
GDAC.download_file("aoml",13857)
#or:
ftp="ftp://usgodae.org/pub/outgoing/argo/dac/"
#ftp="ftp://ftp.ifremer.fr/ifremer/argo/dac/"
GDAC.download_file("aoml",13857,"meta",ftp)ArgoData.GDAC.files_list — Methodfiles_list(fil::String)Get list of Argo float files from csv file with columns two columns – folder and wmo.
using ArgoData
fil="https://raw.githubusercontent.com/euroargodev/ArgoData.jl/gh-pages/dev/Argo_float_files.csv"
files_list=GDAC.files_list(fil)ArgoData.GDAC.files_list — Methodfiles_list()Get list of Argo float files from Ifremer GDAC server ftp://ftp.ifremer.fr/ifremer/argo/dac/
using ArgoData
files_list=GDAC.files_list()ArgoData.GDAC.grey_list — Methodgrey_list(fil::String)Read "ar_greylist.txt" file into a DataFrame.
ArgoData.GDAC.grey_list — Methodgrey_list()Download "ar_greylist.txt" from GDAC and read file into a DataFrame.
Module: MITprof
ArgoData.MITprof.format — Functionformat(meta,gridded_fields,input_file,output_file="")From Argo file name as input : read input file content, process into the MITprof format, and write to MITprof file.
MITprof.format(meta,gridded_fields,input_file)ArgoData.MITprof.format — Methodformat(gridded_fields,input_file)From Argo file name as input : read input file content, process into the MITprof format, and write to MITprof file.
MITprof.format(gridded_fields,input_file)ArgoData.MITprof.format_loop — Methodformat_loop(II)Loop over files and call format.
gridded_fields=GriddedFields.load()
fil=joinpath(tempdir(),"Argo_MITprof_files","input","Argo_float_files.csv")
files_list=GDAC.files_list(fil)
MITprof.format_loop(gridded_fields,files_list,1:10)ArgoData.MITprof.write — MethodMITprof.write(meta,profiles,profiles_std;path="")Create an MITprof file from meta data + profiles during MITprof.format.
MITprof.write(meta,profiles,profiles_std)ArgoData.MITprof.write — Methodwrite(fil::String,mp::MITprofStandard)Create an MITprof file from an MITprofStandard input.
ArgoData.MITprof.write — Methodwrite(fil::String,mps::Vector{MITprofStandard})Create an MITprof file from a vector of MITprofStandard inputs.
Module: GriddedFields
ArgoData.GriddedFields.interp! — Methodinterp!(T_in::MeshArray,Γ,mp::MITprofStandard,📚,T_out)
interp!(T_in::MeshArray,Γ,mp::MITprofStandard,T_out)Interpolate T_in, defined on grid Γ, to locations speficied in mp and store the result in array T_out.
Providing interpolation coefficients 📚 computed beforehand speeds up repeated calls.
Example:
fil=glob("*_MITprof.nc","MITprof")[1000]
mp=MITprofStandard(fil)
(f,i,j,w)=InterpolationFactors(G.Γ,mp.lon[:],mp.lat[:]);
📚=(f=f,i=i,j=j,w=w)
T=[similar(mp.T) for i in 1:12]
[interp!(G.T[i],G.Γ,mp,📚,T[i]) for i in 1:12]In the above example, [interp!(G.T[i],G.Γ,mp,T[i]) for i in 1:12] would be much slower.
ArgoData.GriddedFields.interp — Methodinterp(T_in::MeshArray,Γ,mp::MITprofStandard)Interpolate T_in, defined on grid Γ, to locations speficied in mp.
For a more efficient, in place, option see interp!.
fil="MITprof/1901238_MITprof.nc"
mp=MITprofStandard(fil)
interp(G.T[1],G.Γ,mp)ArgoData.GriddedFields.load — Methodload()Load gridded fields from files (and download them if needed), and return a dictionary (:Γ, :msk, :T, :S, :σT, :σS, :array, :tile).
note : the embeded array() function returns a 2D array initialized to missing, and array(1), array(3,2), etc add dimensions to the resulting array.
using Climatology, ArgoData
gridded_fields=GriddedFields.load()ArgoData.GriddedFields.monthly_climatology_factors — Methodmonthly_climatology_factors(date)If date is a DateTime, a vector of DateTime, or a date in days since DateTime(0) then compute the corresponding climatological months (1 to 12) and interpolation factors (0.0 to 1.0) and return result as fac0,fac1,rec0,rec1.
For example :
ff(x)=sin((x-0.5)/12*2pi)
(fac0,fac1,rec0,rec1)=monthly_climatology_factors(ArgoTools.DateTime(2011,1,10))
gg=fac0*ff(rec0)+fac1*ff(rec1)
(ff(rec0),gg,ff(rec1))Module: MITprofAnalysis
ArgoData.MITprofAnalysis.add_climatology_factors! — Methodadd_climatology_factors!(df)Add temporal interpolation factors (rec0,rec1,fac0,fac1) to DataFrame.
df=CSV.read("csv/profile_positions.csv",DataFrame)
MITprofAnalysis.add_climatology_factors!(df)ArgoData.MITprofAnalysis.add_coeffs! — Methodadd_coeffs!(df)Read profile_coeffs.jld2 and add to df.
df=MITprofAnalysis.read_pos_level(5)
MITprofAnalysis.add_coeffs!(df)ArgoData.MITprofAnalysis.add_level! — Methodadd_level!(df,k; path=default_path)Read from e.g. csv_levels/k1.csv and add variables to df.
df=CSV.read("csv/profile_positions.csv",DataFrame)
MITprofAnalysis.add_level!(df,5)ArgoData.MITprofAnalysis.add_tile! — Methodadd_tile!(df,Γ,n)Add tile index (see MeshArrays.Tiles) to df that can then be used with e.g. groupby.
input_file=joinpath("MITprof_input","profile_positions.csv")
df=CSV.read(input_file,DataFrame)
G=GriddedFields.load()
MITprofAnalysis.add_tile!(df,G.Γ,30)ArgoData.MITprofAnalysis.cost_functions — Functioncost_functions(vv="prof_T",JJ=[])Loop through files and compute nb profiles, nb non-blank profiles, nb levels mean, cost mean.
pth="MITprof/"
nt,np,nz,cost=MITprofAnalysis.cost_functions(pth,"prof_S")
using JLD2
jldsave(joinpath("csv","prof_S_stats.jld2"); nt,np,nz,cost)ArgoData.MITprofAnalysis.csv_of_levels — Functioncsv_of_levels()Create Array of all values for one level, obtained by looping through files in csv/.
ArgoData.MITprofAnalysis.csv_of_positions — Functioncsv_of_positions(path)Create table (DataFrame) of the positions and dates obtained by looping through files in path. Additional information such as float ID, position on the ECCO grid pos, number of valid data points for T and S (nbT ,nbS).
using ArgoData
using MeshArrays
Γ=GridLoad(ID=:LLC90)
path=MITprof.default_path
df=MITprofAnalysis.csv_of_positions(path,Γ)
csv_file=joinpath(default_path,"profile_positions.csv")
CSV.write(csv_file, df)ArgoData.MITprofAnalysis.csv_of_variables — Methodcsv_of_variables(name::String)Create Array of all values for one variable, obtained by looping through files in path.
@everywhere using ArgoData, CSV, DataFrames
@everywhere list_v=("prof_T","prof_Testim","prof_Tweight","prof_S","prof_Sestim","prof_Sweight")
@distributed for v in list_v
tmp=MITprofAnalysis.csv_of_variables(v)
CSV.write(output_file,DataFrame(tmp,:auto))
endArgoData.MITprofAnalysis.parse_pos — Methodparse_pos(p)Parse String vector p into a vector of CartesianIndex.
ArgoData.MITprofAnalysis.prepare_interpolation — Methodprepare_interpolation(Γ,lon,lat)Alias for InterpolationFactors(Γ,lon,lat).
The loop below creates interpolation coefficients for all data points.
The results are stored in a file called csv/profile_coeffs.jld2 at the end.
using SharedArrays, Distributed
@everywhere begin
using ArgoData, MeshArrays, CSV, DataFrames
Γ=GridLoad(ID=:LLC90,option=:full)
#df=MITprofAnalysis.read_pos_level(5)
pth="data/MITprof_combined"
csv_file=joinpath(pth,"profile_positions.csv")
df=CSV.read(csv_file,DataFrame)
np=size(df,1)
n0=10000
nq=Int(ceil(np/n0))
end
(f,i,j,w)=( SharedArray{Int64}(np,4), SharedArray{Int64}(np,4),
SharedArray{Int64}(np,4), SharedArray{Float64}(np,4) )
@sync @distributed for m in 1:nq
ii=n0*(m-1) .+collect(1:n0)
ii[end]>np ? ii=n0*(m-1) .+collect(1:n0+np-ii[end]) : nothing
tmp=MITprofAnalysis.prepare_interpolation(Γ,df.lon[ii],df.lat[ii])
f[ii,:].=tmp[1]
i[ii,:].=tmp[2]
j[ii,:].=tmp[3]
w[ii,:].=tmp[4]
end
fil=joinpath(pth,"profile_coeffs.jld2")
co=[(f=f[ii,:],i=i[ii,:],j=j[ii,:],w=w[ii,:]) for ii in 1:np]
save_object(fil,co)ArgoData.MITprofAnalysis.read_pos_level — Functionread_pos_level(k=1; path="")Read in from csv/profile_positions.csv and e.g. csv_levels/k1.csv, parse pos, then add_level!(df,k), and return a DataFrame.
df=MITprofAnalysis.read_pos_level(5)ArgoData.MITprofAnalysis.subset — Methodsubset(df;lons=(-180.0,180.0),lats=(-90.0,90.0),dates=())Subset of df that's within specified date and position ranges.
df=CSV.read("csv/profile_positions.csv",DataFrame)
d0=DateTime("2012-06-11T18:50:04")
d1=DateTime("2012-07-11T18:50:04")
df1=MITprofAnalysis.subset(df,dates=(d0,d1))
df2=MITprofAnalysis.subset(df,lons=(0,10),lats=(-5,5),dates=(d0,d1))ArgoData.MITprofAnalysis.trim — Methodtrim(df)Filter out data points that lack T, Te, etc.
df=CSV.read("csv/profile_positions.csv",DataFrame)
MITprofAnalysis.add_level!(df,1)
df1=MITprofAnalysis.trim(df,:T)ArgoData.MITprofAnalysis.trim_high_cost — Methodtrim_high_cost(df)Filter out data points that lack T, Te, etc.
df=CSV.read("csv/profile_positions.csv",DataFrame)
MITprofAnalysis.add_level!(df,1)
df=MITprofAnalysis.trim(df,:T)
df=MITprofAnalysis.trim_high_cost(df,:T,fcmax=5.0)Module: MITprofStat
ArgoData.MITprofStat.basic_config — Methodbasic_config(;preset=1)List of confiburations (each one a choice of nmon,npoint,nobs) to be used in stat_combine.
ArgoData.MITprofStat.combine_driver — Methodcombine_driver(; level=1,varia=:Td, output_path=tempdir(), output_format="MITgcm")Loop over all records, call stat_combine, write to netcdf file.
ArgoData.MITprofStat.stat_df — Methodstat_df(df::DataFrame,by::Symbol,va::Symbol)Compute statistics (mean, median, variance) of variable va from DataFrame df grouped by by.
ArgoData.MITprofStat.stat_driver — Methodstat_driver(;varia=:Td,level=1,years=2004:2022,
nmap=0, sta=:none, do_monthly_climatology=false, reference=:OCCA1,
output_to_file=false, output_path=default_path)Call stat_monthly! with parameters from the nmap line of joinpath(output_path,"argostats_mappings.nc").
P=( variable=:Td, level=10, nmap=1, sta=:mean,
output_to_file=true, output_path=MITprof.default_path)
MITprofStat.stat_driver(; varia=P.variable,level=P.level, nmap=P.nmap, sta=P.statistic,
reference=P.reference, output_path=P.output_path, output_to_file=P.output_to_file,
initial_adjustment="")ArgoData.MITprofStat.stat_grid! — Methodstat_grid!(ar::Array,df::DataFrame,va::Symbol,sta::Symbol; func=(x->x))Compute map ar of statistic sta of variable va from DataFrame df. This assumes that df.pos are indices into Array ar and should be used to groupby df.
ArgoData.MITprofStat.stat_monthly! — Methodstat_monthly!(arr:Array,df::DataFrame,va::Symbol,sta::Symbol,years,G::NamedTuple;
func=(x->x), nmon=1, nobs=1)Compute maps of statistic sta for variable va from DataFrame df for years years. This assumes that df.pos are indices into Array ar and should be used to groupby df. For each year in years, twelve fields are computed – one per month.
using ArgoData
G=GriddedFields.load()
df1=MITprofAnalysis.trim( MITprofAnalysis.read_pos_level(1, path="MITprof_input") , :T)
years=2004:2007
arr=G.array(12,length(years))
MITprofStat.stat_monthly!(arr,df1,:Td,:median,years,G,nmon=3);ArgoData.MITprofStat.stat_monthly! — Methodstat_monthly!(ar::Array,df::DataFrame,va::Symbol,sta::Symbol,y::Int,m::Int,G::NamedTuple;
func=(x->x), nmon=1, nobs=1)Compute map ar of statistic sta for variable va from DataFrame df for year y and month m. This assumes that df.pos are indices into Array ar and should be used to groupby df.
using ArgoData
G=GriddedFields.load();
P=( variable=:Td, level=10, year=2002, month=1, input_path=MITprof.default_path,
statistic=:median, nmon=3, rng=(-1.0,1.0))
df1=MITprofAnalysis.trim( MITprofAnalysis.read_pos_level(P.level,path=P.input_path), :T)
GriddedFields.update_tile!(G,P.npoint);
ar1=G.array();
MITprofStat.stat_monthly!(ar1,df1,P.variable,P.statistic,P.year,P.month,G,nmon=P.nmon);
MITprofPlots.stat_map(ar1,G,rng=P.rng)ArgoData.MITprofStat.stat_write — Methodstat_write(file,arr,varia)Module: Argo_parquet
ArgoData.Argo_parquet.Dataset — MethodDataset(folder_pq)Open dataset as a Argo_pq data structure
folder_pq=Argo_parquet.sample_download("ARGO_PHY_SAMPLE_QC")
da=Argo_parquet.Dataset(folder_pq)is equivalent to :
ds2 = Parquet2.Dataset(folder_pq)
### append all row groups (important step)
Parquet2.appendall!(ds2)
lst=Parquet2.filelist(ds2)
sch=Tables.schema(ds2)
Argo_pq(ds2,sch,lst,folder_pq)ArgoData.Argo_parquet.get_lon_lat_temp — Methodget_lon_lat_temp(df3)ArgoData.Argo_parquet.get_positions — Methodget_positions(df3)ArgoData.Argo_parquet.get_subset_float — Methodget_subset_float(ds2::Parquet2.Dataset ; ID=3901064,
variables=(:JULD, :LATITUDE, :LONGITUDE, :PRES, :TEMP, :PLATFORM_NUMBER))
rule_PLATFORM_NUMBER = x -> coalesce.(x, 0).==Ref(string(ID))ArgoData.Argo_parquet.get_subset_region — Methodget_subset_region(ds2::Parquet2.Dataset; lons=-75 .. -50, lats=25 .. 40,
dates=DateTime("2001-01-01T00:00:00") .. DateTime("2024-12-31T23:59:59"),
variables=(:JULD, :LATITUDE, :LONGITUDE, :PRES, :TEMP, :PLATFORM_NUMBER))ArgoData.Argo_parquet.sample_download — Functionsample_download(folder="ARGO_PHY_SAMPLE_QC")Get sample data set from archive.