Reference · ArgoData.jl

Content of this section:

Data Structures
Modules (1): GDAC, MITprof, GriddedFields
Modules (2): MITprofAnalysis, MITprofStat

Module: GDAC

ArgoData.GDAC.download_file — Function

download_file(file::DataFrameRow,suff="prof",ftp=missing)

Get folder and wmo from data frame row and them call download_file.

using ArgoData
files_list=GDAC.files_list()
file=GDAC.download_file(files_list[10000,:])

source

ArgoData.GDAC.download_file — Function

download_file(folder::String,wmo::Int,suff="prof",ftp="https://data-argo.ifremer.fr/dac/")

Download Argo file from the GDAC server (<https://data-argo.ifremer.fr/dac/> by default) to a temporary folder (joinpath(tempdir(),"Argo_DAC_files"))

The file name is given by folder and wmo. For example, 13857_prof.nc for wmo=13857 is from the aoml folder.

The default for suff is "prof" which means we'll download the file that contains the profile data. Other possible choices are "meta", "Rtraj", "tech".

If the ftp argument is omitted or isa(ftp,String) then Downloads.download is used. If, alternatively, isa(ftp,FTP) then FTPClient.download is used.

Example :

using ArgoData
GDAC.download_file("aoml",13857)

#or:
ftp="ftp://usgodae.org/pub/outgoing/argo/dac/"
#ftp="ftp://ftp.ifremer.fr/ifremer/argo/dac/"
GDAC.download_file("aoml",13857,"meta",ftp)

source

ArgoData.GDAC.files_list — Method

files_list(fil::String)

Get list of Argo float files from csv file with columns two columns – folder and wmo.

using ArgoData
fil="https://raw.githubusercontent.com/euroargodev/ArgoData.jl/gh-pages/dev/Argo_float_files.csv"
files_list=GDAC.files_list(fil)

source

ArgoData.GDAC.files_list — Method

files_list()

Get list of Argo float files from Ifremer GDAC server ftp://ftp.ifremer.fr/ifremer/argo/dac/

using ArgoData
files_list=GDAC.files_list()

source

ArgoData.GDAC.grey_list — Method

grey_list(fil::String)

Read "ar_greylist.txt" file into a DataFrame.

source

ArgoData.GDAC.grey_list — Method

grey_list()

Download "ar_greylist.txt" from GDAC and read file into a DataFrame.

source

Module: MITprof

ArgoData.MITprof.format — Function

format(meta,gridded_fields,input_file,output_file="")

From Argo file name as input : read input file content, process into the MITprof format, and write to MITprof file.

MITprof.format(meta,gridded_fields,input_file)

source

ArgoData.MITprof.format — Method

format(gridded_fields,input_file)

From Argo file name as input : read input file content, process into the MITprof format, and write to MITprof file.

MITprof.format(gridded_fields,input_file)

source

ArgoData.MITprof.format_loop — Method

format_loop(II)

Loop over files and call format.

gridded_fields=GriddedFields.load()
fil=joinpath(tempdir(),"Argo_MITprof_files","input","Argo_float_files.csv")
files_list=GDAC.files_list(fil)
MITprof.format_loop(gridded_fields,files_list,1:10)

source

ArgoData.MITprof.write — Method

MITprof.write(meta,profiles,profiles_std;path="")

Create an MITprof file from meta data + profiles during MITprof.format.

MITprof.write(meta,profiles,profiles_std)

source

ArgoData.MITprof.write — Method

write(fil::String,mp::MITprofStandard)

Create an MITprof file from an MITprofStandard input.

source

ArgoData.MITprof.write — Method

write(fil::String,mps::Vector{MITprofStandard})

Create an MITprof file from a vector of MITprofStandard inputs.

source

Module: GriddedFields

ArgoData.GriddedFields.interp! — Method

interp!(T_in::MeshArray,Γ,mp::MITprofStandard,📚,T_out)
interp!(T_in::MeshArray,Γ,mp::MITprofStandard,T_out)

Interpolate T_in, defined on grid Γ, to locations speficied in mp and store the result in array T_out.

Providing interpolation coefficients 📚 computed beforehand speeds up repeated calls.

Example:

fil=glob("*_MITprof.nc","MITprof")[1000]
mp=MITprofStandard(fil)

(f,i,j,w)=InterpolationFactors(G.Γ,mp.lon[:],mp.lat[:]);
📚=(f=f,i=i,j=j,w=w)

T=[similar(mp.T) for i in 1:12]
[interp!(G.T[i],G.Γ,mp,📚,T[i]) for i in 1:12]

In the above example, [interp!(G.T[i],G.Γ,mp,T[i]) for i in 1:12] would be much slower.

source

ArgoData.GriddedFields.interp — Method

interp(T_in::MeshArray,Γ,mp::MITprofStandard)

Interpolate T_in, defined on grid Γ, to locations speficied in mp.

For a more efficient, in place, option see interp!.

fil="MITprof/1901238_MITprof.nc"
mp=MITprofStandard(fil)

interp(G.T[1],G.Γ,mp)

source

ArgoData.GriddedFields.load — Method

load()

Load gridded fields from files (and download them if needed), and return a dictionary (:Γ, :msk, :T, :S, :σT, :σS, :array, :tile).

note : the embeded array() function returns a 2D array initialized to missing, and array(1), array(3,2), etc add dimensions to the resulting array.

using Climatology, ArgoData
gridded_fields=GriddedFields.load()

source

ArgoData.GriddedFields.monthly_climatology_factors — Method

monthly_climatology_factors(date)

If date is a DateTime, a vector of DateTime, or a date in days since DateTime(0) then compute the corresponding climatological months (1 to 12) and interpolation factors (0.0 to 1.0) and return result as fac0,fac1,rec0,rec1.

For example :

ff(x)=sin((x-0.5)/12*2pi)
(fac0,fac1,rec0,rec1)=monthly_climatology_factors(ArgoTools.DateTime(2011,1,10))

gg=fac0*ff(rec0)+fac1*ff(rec1)
(ff(rec0),gg,ff(rec1))

source

Module: MITprofAnalysis

ArgoData.MITprofAnalysis.add_climatology_factors! — Method

add_climatology_factors!(df)

Add temporal interpolation factors (rec0,rec1,fac0,fac1) to DataFrame.

df=CSV.read("csv/profile_positions.csv",DataFrame)
MITprofAnalysis.add_climatology_factors!(df)

source

ArgoData.MITprofAnalysis.add_coeffs! — Method

add_coeffs!(df)

Read profile_coeffs.jld2 and add to df.

df=MITprofAnalysis.read_pos_level(5)
MITprofAnalysis.add_coeffs!(df)

source

ArgoData.MITprofAnalysis.add_level! — Method

add_level!(df,k; path=default_path)

Read from e.g. csv_levels/k1.csv and add variables to df.

df=CSV.read("csv/profile_positions.csv",DataFrame)
MITprofAnalysis.add_level!(df,5)

source

ArgoData.MITprofAnalysis.add_tile! — Method

add_tile!(df,Γ,n)

Add tile index (see MeshArrays.Tiles) to df that can then be used with e.g. groupby.

input_file=joinpath("MITprof_input","profile_positions.csv")
df=CSV.read(input_file,DataFrame)
G=GriddedFields.load()
MITprofAnalysis.add_tile!(df,G.Γ,30)

source

ArgoData.MITprofAnalysis.cost_functions — Function

cost_functions(vv="prof_T",JJ=[])

Loop through files and compute nb profiles, nb non-blank profiles, nb levels mean, cost mean.

pth="MITprof/"
nt,np,nz,cost=MITprofAnalysis.cost_functions(pth,"prof_S")

using JLD2
jldsave(joinpath("csv","prof_S_stats.jld2"); nt,np,nz,cost)

source

ArgoData.MITprofAnalysis.csv_of_levels — Function

csv_of_levels()

Create Array of all values for one level, obtained by looping through files in csv/.

source

ArgoData.MITprofAnalysis.csv_of_positions — Function

csv_of_positions(path)

Create table (DataFrame) of the positions and dates obtained by looping through files in path. Additional information such as float ID, position on the ECCO grid pos, number of valid data points for T and S (nbT ,nbS).

using ArgoData
using MeshArrays
Γ=GridLoad(ID=:LLC90)
path=MITprof.default_path
df=MITprofAnalysis.csv_of_positions(path,Γ)
csv_file=joinpath(default_path,"profile_positions.csv")
CSV.write(csv_file, df)

source

ArgoData.MITprofAnalysis.csv_of_variables — Method

csv_of_variables(name::String)

Create Array of all values for one variable, obtained by looping through files in path.

@everywhere using ArgoData, CSV, DataFrames
@everywhere list_v=("prof_T","prof_Testim","prof_Tweight","prof_S","prof_Sestim","prof_Sweight")
@distributed for v in list_v
    tmp=MITprofAnalysis.csv_of_variables(v)
    CSV.write(output_file,DataFrame(tmp,:auto))
end

source

ArgoData.MITprofAnalysis.parse_pos — Method

parse_pos(p)

Parse String vector p into a vector of CartesianIndex.

source

ArgoData.MITprofAnalysis.prepare_interpolation — Method

prepare_interpolation(Γ,lon,lat)

Alias for InterpolationFactors(Γ,lon,lat).

The loop below creates interpolation coefficients for all data points.

The results are stored in a file called csv/profile_coeffs.jld2 at the end.

using SharedArrays, Distributed

@everywhere begin
    using ArgoData, MeshArrays, CSV, DataFrames
    Γ=GridLoad(ID=:LLC90,option=:full)
    
    #df=MITprofAnalysis.read_pos_level(5)
    pth="data/MITprof_combined"
    csv_file=joinpath(pth,"profile_positions.csv")
    df=CSV.read(csv_file,DataFrame)

    np=size(df,1)
    n0=10000
    nq=Int(ceil(np/n0))
end

(f,i,j,w)=( SharedArray{Int64}(np,4), SharedArray{Int64}(np,4),
            SharedArray{Int64}(np,4), SharedArray{Float64}(np,4) )

@sync @distributed for m in 1:nq
    ii=n0*(m-1) .+collect(1:n0)
    ii[end]>np ? ii=n0*(m-1) .+collect(1:n0+np-ii[end]) : nothing
    tmp=MITprofAnalysis.prepare_interpolation(Γ,df.lon[ii],df.lat[ii])
    f[ii,:].=tmp[1]
    i[ii,:].=tmp[2]
    j[ii,:].=tmp[3]
    w[ii,:].=tmp[4]
end

fil=joinpath(pth,"profile_coeffs.jld2")
co=[(f=f[ii,:],i=i[ii,:],j=j[ii,:],w=w[ii,:]) for ii in 1:np]
save_object(fil,co)

source

ArgoData.MITprofAnalysis.read_pos_level — Function

read_pos_level(k=1; path="")

Read in from csv/profile_positions.csv and e.g. csv_levels/k1.csv, parse pos, then add_level!(df,k), and return a DataFrame.

df=MITprofAnalysis.read_pos_level(5)

source

ArgoData.MITprofAnalysis.subset — Method

subset(df;lons=(-180.0,180.0),lats=(-90.0,90.0),dates=())

Subset of df that's within specified date and position ranges.

df=CSV.read("csv/profile_positions.csv",DataFrame)
d0=DateTime("2012-06-11T18:50:04")
d1=DateTime("2012-07-11T18:50:04")
df1=MITprofAnalysis.subset(df,dates=(d0,d1))
df2=MITprofAnalysis.subset(df,lons=(0,10),lats=(-5,5),dates=(d0,d1))

source

ArgoData.MITprofAnalysis.trim — Method

trim(df)

Filter out data points that lack T, Te, etc.

df=CSV.read("csv/profile_positions.csv",DataFrame)
MITprofAnalysis.add_level!(df,1)
df1=MITprofAnalysis.trim(df,:T)

source

ArgoData.MITprofAnalysis.trim_high_cost — Method

trim_high_cost(df)

Filter out data points that lack T, Te, etc.

df=CSV.read("csv/profile_positions.csv",DataFrame)
MITprofAnalysis.add_level!(df,1)
df=MITprofAnalysis.trim(df,:T)
df=MITprofAnalysis.trim_high_cost(df,:T,fcmax=5.0)

source

Module: MITprofStat

ArgoData.MITprofStat.basic_config — Method

basic_config(;preset=1)

List of confiburations (each one a choice of nmon,npoint,nobs) to be used in stat_combine.

source

ArgoData.MITprofStat.combine_driver — Method

combine_driver(; level=1,varia=:Td, output_path=tempdir(), output_format="MITgcm")

Loop over all records, call stat_combine, write to netcdf file.

source

ArgoData.MITprofStat.stat_df — Method

stat_df(df::DataFrame,by::Symbol,va::Symbol)

Compute statistics (mean, median, variance) of variable va from DataFrame df grouped by by.

source

ArgoData.MITprofStat.stat_driver — Method

stat_driver(;varia=:Td,level=1,years=2004:2022,
    nmap=0, sta=:none, do_monthly_climatology=false, reference=:OCCA1,
    output_to_file=false, output_path=default_path)

Call stat_monthly! with parameters from the nmap line of joinpath(output_path,"argostats_mappings.nc").

P=( variable=:Td, level=10, nmap=1, sta=:mean, 
    output_to_file=true, output_path=MITprof.default_path)

MITprofStat.stat_driver(; varia=P.variable,level=P.level, nmap=P.nmap, sta=P.statistic, 
        reference=P.reference, output_path=P.output_path, output_to_file=P.output_to_file,
        initial_adjustment="")

source

ArgoData.MITprofStat.stat_grid! — Method

stat_grid!(ar::Array,df::DataFrame,va::Symbol,sta::Symbol; func=(x->x))

Compute map ar of statistic sta of variable va from DataFrame df. This assumes that df.pos are indices into Array ar and should be used to groupby df.

source

ArgoData.MITprofStat.stat_monthly! — Method

stat_monthly!(arr:Array,df::DataFrame,va::Symbol,sta::Symbol,years,G::NamedTuple;
                func=(x->x), nmon=1, nobs=1)

Compute maps of statistic sta for variable va from DataFrame df for years years. This assumes that df.pos are indices into Array ar and should be used to groupby df. For each year in years, twelve fields are computed – one per month.

using ArgoData
G=GriddedFields.load()
df1=MITprofAnalysis.trim( MITprofAnalysis.read_pos_level(1, path="MITprof_input") , :T)

years=2004:2007
arr=G.array(12,length(years))
MITprofStat.stat_monthly!(arr,df1,:Td,:median,years,G,nmon=3);

source

ArgoData.MITprofStat.stat_monthly! — Method

stat_monthly!(ar::Array,df::DataFrame,va::Symbol,sta::Symbol,y::Int,m::Int,G::NamedTuple;
                func=(x->x), nmon=1, nobs=1)

Compute map ar of statistic sta for variable va from DataFrame df for year y and month m. This assumes that df.pos are indices into Array ar and should be used to groupby df.

using ArgoData
G=GriddedFields.load();

P=( variable=:Td, level=10, year=2002, month=1, input_path=MITprof.default_path,
    statistic=:median, nmon=3, rng=(-1.0,1.0))

df1=MITprofAnalysis.trim( MITprofAnalysis.read_pos_level(P.level,path=P.input_path), :T)

GriddedFields.update_tile!(G,P.npoint);
ar1=G.array();
MITprofStat.stat_monthly!(ar1,df1,P.variable,P.statistic,P.year,P.month,G,nmon=P.nmon);

MITprofPlots.stat_map(ar1,G,rng=P.rng)

source

ArgoData.MITprofStat.stat_write — Method

stat_write(file,arr,varia)

source

Module: Argo_parquet

ArgoData.Argo_parquet.Dataset — Method

Dataset(folder_pq)

Open dataset as a Argo_pq data structure

folder_pq=Argo_parquet.sample_download("ARGO_PHY_SAMPLE_QC")
da=Argo_parquet.Dataset(folder_pq)

is equivalent to :

ds2 = Parquet2.Dataset(folder_pq)
### append all row groups (important step)
Parquet2.appendall!(ds2)
lst=Parquet2.filelist(ds2)
sch=Tables.schema(ds2)
Argo_pq(ds2,sch,lst,folder_pq)

source

ArgoData.Argo_parquet.get_lon_lat_temp — Method

get_lon_lat_temp(df3)

source

ArgoData.Argo_parquet.get_positions — Method

get_positions(df3)

source

ArgoData.Argo_parquet.get_subset_float — Method

get_subset_float(ds2::Parquet2.Dataset ; ID=3901064,
variables=(:JULD, :LATITUDE, :LONGITUDE, :PRES, :TEMP, :PLATFORM_NUMBER))
rule_PLATFORM_NUMBER = x -> coalesce.(x, 0).==Ref(string(ID))

source

ArgoData.Argo_parquet.get_subset_region — Method

get_subset_region(ds2::Parquet2.Dataset; lons=-75 .. -50, lats=25 .. 40, 
dates=DateTime("2001-01-01T00:00:00") .. DateTime("2024-12-31T23:59:59"),
variables=(:JULD, :LATITUDE, :LONGITUDE, :PRES, :TEMP, :PLATFORM_NUMBER))

source

ArgoData.Argo_parquet.sample_download — Function

sample_download(folder="ARGO_PHY_SAMPLE_QC")

Get sample data set from archive.

source

Data Structures

Module: GDAC

Module: MITprof

Module: GriddedFields

Module: MITprofAnalysis

Module: MITprofStat

Module: Argo_parquet