NetCDF  4.5.0
auth.md
netCDF Authorization Support
======================================
<!-- double header is needed to workaround doxygen bug -->

# netCDF Authorization Support {#Header}

__Author__: Dennis Heimbigner<br>
__Initial Version__: 11/21/2014<br>
__Last Revised__: 08/24/2017

[TOC]

## Introduction {#Introduction}

netCDF can support user authorization using the facilities provided by the curl
library. This includes basic password authentication as well as
certificate-based authorization.

At the moment, this document applies only to DAP2 and DAP4 access
because they are (for now) the only parts of the netCDF-C library
that use libcurl.

With some exceptions (e.g. see the section on <a href="#REDIR">redirection</a>),
the libcurl authorization mechanisms can be accessed in two ways:

1. Inserting the username and password into the url, or
2. Accessing information from a so-called _rc_ file named either
   `.daprc` or `.dodsrc`.

## URL-Based Authentication {#URLAUTH}

For simple password-based authentication, it is possible to
insert the username and the password directly into a url in this form:

    http://username:password@host/...

This username and password will be used if the server asks for
authentication. Note that only simple password authentication
is supported in this format.

Note in particular that [redirection-based](#REDIR)
authorization may work with this form, but it is a security risk
because the username and password
may be sent to each server in the redirection chain.

Note also that the `user:password` form may contain characters that must be
escaped. See the <a href="#USERPWDESCAPE">password escaping</a> section to see
how to properly escape the user and password.

## RC File Authentication {#DODSRC}
The netcdf library supports an _rc_ file mechanism to allow the passing
of a number of parameters to libnetcdf and libcurl.
Locating the _rc_ file is a multi-step process.

### Search Order

The file must have one of the following names:
".daprc" or ".dodsrc".
If both ".daprc" and ".dodsrc" exist, then
the ".daprc" file will take precedence.

It is strongly suggested that you pick one of the two names
and use it always. Otherwise you may observe unexpected results
when the netcdf-c library finds one that you did not intend.

The search for an _rc_ file looks in the following places in this order.

1. Check for the environment variable named _DAPRCFILE_.
   This will specify the full path for the _rc_ file
   (not just the containing directory).
2. Search the current working directory (`./`) looking
   for (in order) .daprc or .dodsrc.
3. Search the HOME directory (`$HOME`) looking
   for (in order) .daprc or .dodsrc. The HOME environment
   variable is used to define the directory in which to search.

It is strongly suggested that you pick a uniform location
and use it always. Otherwise you may observe unexpected results
when the netcdf-c library gets an rc file you did not expect.

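The search order above can be sketched as a small shell function, which may
help when working out which rc file a program is likely to pick up. This is
only an illustration of the documented order, not the library's own code:
the function name `find_rc_file` is hypothetical, and the sketch assumes
that a non-empty _DAPRCFILE_ is used as given without further checks.

```shell
#!/bin/sh
# Hypothetical helper: print the rc file that the documented search
# order would select; return non-zero if none is found.
find_rc_file() {
    # 1. DAPRCFILE, if set, names the rc file by its full path.
    #    (Assumption: it is used as given, with no existence check.)
    if [ -n "${DAPRCFILE:-}" ]; then
        printf '%s\n' "$DAPRCFILE"
        return 0
    fi
    # 2. The current working directory, then 3. the HOME directory.
    #    In each directory, .daprc is tried before .dodsrc.
    for dir in . "${HOME:-}"; do
        [ -n "$dir" ] || continue
        for name in .daprc .dodsrc; do
            if [ -f "$dir/$name" ]; then
                printf '%s\n' "$dir/$name"
                return 0
            fi
        done
    done
    return 1
}
```

Calling `find_rc_file` from the directory in which a netCDF program will
run prints the candidate file, or prints nothing and returns a non-zero
status when no rc file is found.
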
### RC File Format

The rc file format is a series of lines of the general form:

    [<host:port>]<key>=<value>

where the bracket-enclosed host:port is optional.

### URL Constrained RC File Entries

Each line of the rc file can begin with
a host+port enclosed in square brackets.
The form is "host:port".
If the port is not specified
then the form is just "host".
The reason that more of the url is not used is that
libcurl's authorization grain is not any finer than host level.

Examples:

    [remotetest.unidata.ucar.edu]HTTP.VERBOSE=1

or

    [fake.ucar.edu:9090]HTTP.VERBOSE=0

If the url passed to, say, the `nc_open` function
has a host+port matching one of the prefixes in the rc file, then
the corresponding entry will be used; otherwise the entry is ignored.
This means that an entry with a matching host+port will take
precedence over an entry without a host+port.

For example, the URL

    http://remotetest.unidata.ucar.edu/thredds/dodsC/testdata/testData.nc

will have HTTP.VERBOSE set to 1 because its host matches the example above.

Similarly,

    http://fake.ucar.edu:9090/dts/test.01

will have HTTP.VERBOSE set to 0 because its host+port matches the example above.

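The precedence rule can be illustrated with a short shell sketch that looks
up a key for a given host+port. The helper name `rc_lookup` and its argument
order are hypothetical, and the matching is deliberately simplified (exact
string comparison of the bracketed prefix); the library's own parser may
differ in detail.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY for HOSTPORT in RCFILE.
# An entry prefixed with [HOSTPORT] wins over an entry with no prefix;
# entries prefixed with a different host+port are ignored.
rc_lookup() {
    rcfile=$1
    hostport=$2
    key=$3
    awk -v hp="[$hostport]" -v key="$key" '
        {
            line = $0
            prefix = ""
            # Split off an optional leading [host:port] prefix.
            if (substr(line, 1, 1) == "[") {
                close_at = index(line, "]")
                if (close_at == 0) next
                prefix = substr(line, 1, close_at)
                line = substr(line, close_at + 1)
            }
            # Keep only lines of the form key=value for the wanted key.
            eq = index(line, "=")
            if (eq == 0 || substr(line, 1, eq - 1) != key) next
            value = substr(line, eq + 1)
            if (prefix == hp) { specific = value; have_specific = 1 }
            else if (prefix == "") { general = value; have_general = 1 }
        }
        END {
            if (have_specific) print specific
            else if (have_general) print general
            else exit 1
        }
    ' "$rcfile"
}
```

With an rc file containing `HTTP.VERBOSE=0` and
`[remotetest.unidata.ucar.edu]HTTP.VERBOSE=1`, a lookup of HTTP.VERBOSE for
`remotetest.unidata.ucar.edu` selects the bracketed entry, while a lookup
for any other host falls back to the unprefixed one.
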
## Authorization-Related Keys {#AUTHKEYS}

The currently defined set of authorization-related keys is as follows.
The second column lists the affected curl_easy_setopt option(s), if any.
<table>
<tr><th>Key</th><th>Affected curl_easy_setopt Options</th><th>Notes</th></tr>
<tr><td>HTTP.COOKIEJAR</td><td>CURLOPT_COOKIEJAR</td><td></td></tr>
<tr><td>HTTP.COOKIEFILE</td><td>CURLOPT_COOKIEJAR</td><td>Alias for HTTP.COOKIEJAR</td></tr>
<tr><td>HTTP.PROXY_SERVER</td><td>CURLOPT_PROXY, CURLOPT_PROXYPORT, CURLOPT_PROXYUSERPWD</td><td></td></tr>
<tr><td>HTTP.SSL.CERTIFICATE</td><td>CURLOPT_SSLCERT</td><td></td></tr>
<tr><td>HTTP.SSL.KEY</td><td>CURLOPT_SSLKEY</td><td></td></tr>
<tr><td>HTTP.SSL.KEYPASSWORD</td><td>CURLOPT_KEYPASSWD</td><td></td></tr>
<tr><td>HTTP.SSL.CAINFO</td><td>CURLOPT_CAINFO</td><td></td></tr>
<tr><td>HTTP.SSL.CAPATH</td><td>CURLOPT_CAPATH</td><td></td></tr>
<tr><td>HTTP.SSL.VERIFYPEER</td><td>CURLOPT_SSL_VERIFYPEER</td><td></td></tr>
<tr><td>HTTP.SSL.VALIDATE</td><td>CURLOPT_SSL_VERIFYPEER, CURLOPT_SSL_VERIFYHOST</td><td></td></tr>
<tr><td>HTTP.CREDENTIALS.USERPASSWORD</td><td>CURLOPT_USERPWD</td><td></td></tr>
<tr><td>HTTP.NETRC</td><td>CURLOPT_NETRC, CURLOPT_NETRC_FILE</td><td></td></tr>
</table>

### Password Authentication

The key
HTTP.CREDENTIALS.USERPASSWORD
can be used to set up simple password authentication.
This is an alternative to setting it in the url.
The value must be of the form "username:password".
See the <a href="#USERPWDESCAPE">password escaping</a> section
to see how this value must escape certain characters.
Also see <a href="#REDIR">redirection authorization</a>
for important additional information.

### Cookie Jar

The HTTP.COOKIEJAR key
specifies the name of the file from which
to read cookies (CURLOPT_COOKIEFILE) and also
the file into which to store cookies (CURLOPT_COOKIEJAR).
The same value is used for both CURLOPT values.
It defaults to in-memory storage.
See [redirection authorization](#REDIR)
for important additional information.

### Certificate Authentication

HTTP.SSL.CERTIFICATE
specifies a file path for a file containing a PEM certificate.
This is typically used for client-side authentication.

HTTP.SSL.KEY is essentially the same as HTTP.SSL.CERTIFICATE
and should always have the same value.

HTTP.SSL.KEYPASSWORD
specifies the password for accessing the HTTP.SSL.CERTIFICATE/HTTP.SSL.KEY file.

HTTP.SSL.CAPATH
specifies the path to a directory containing
trusted certificates for validating server certificates.

HTTP.SSL.VALIDATE
is a boolean (1/0) value that, if true (1),
specifies that the client should verify the server's presented certificate.

HTTP.PROXY_SERVER
specifies the url for accessing the proxy,
e.g. *http://[username:password@]host[:port]*

HTTP.NETRC
specifies the absolute path of the .netrc file.
See [redirection authorization](#REDIR)
for information about using .netrc.

## Password Escaping {#USERPWDESCAPE}

With current password rules, it is likely that the password
will contain characters that need to be escaped. Similarly, the user
may contain characters such as '@' that need to be escaped. To support this,
it is assumed that all occurrences of `user:password` use URL (i.e. `%XX`)
escaping for at least the characters in the table below.

The minimum set of characters that must be escaped depends on the location.
If the user+pwd is embedded in the URL, then '@' and ':' __must__ be escaped.
If the user+pwd is the value for
the HTTP.CREDENTIALS.USERPASSWORD key in the _rc_ file, then
':' __must__ be escaped.
Escaping should __not__ be used in the `.netrc` file.

The relevant escape codes are as follows.
<table>
<tr><th>Character</th><th>Escaped Form</th></tr>
<tr><td>'@'</td><td>%40</td></tr>
<tr><td>':'</td><td>%3a</td></tr>
</table>
Additional characters can be escaped if desired.

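As a minimal sketch of these rules, the following shell fragment escapes a
user name and a password and then prints both the URL form and the rc file
form. The helper name `escape_userpwd` and the sample credentials are
hypothetical. Escaping a literal '%' as `%25` is an assumption made here so
that the escaped text decodes unambiguously; it is one of the "additional
characters" rather than part of the required minimum.

```shell
#!/bin/sh
# Hypothetical helper: percent-escape '%', '@' and ':' in a single
# component (a user name or a password, never the joined pair).
escape_userpwd() {
    # '%' is escaped first so that the '%' characters introduced by
    # the later substitutions are not escaped a second time.
    printf '%s' "$1" | sed -e 's/%/%25/g' -e 's/@/%40/g' -e 's/:/%3a/g'
}

# Sample credentials (placeholders only).
user=$(escape_userpwd 'jane@example.org')
password=$(escape_userpwd 'p:ss@word')

# Form for embedding in a URL; the host part is left as a placeholder.
echo "http://${user}:${password}@host/..."
# Form for the rc file.
echo "HTTP.CREDENTIALS.USERPASSWORD=${user}:${password}"
```

Each component is escaped separately so that the ':' separating user from
password stays unescaped. The same values would go into a `.netrc` file
unescaped.
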
## Redirection-Based Authentication {#REDIR}

Some sites provide authentication by using a third party site
to do the authentication. Examples include ESG, URS, RDA, and most OAuth2-based
systems.

The process is usually as follows.

1. The client contacts the server of interest (SOI), the actual data provider,
   typically using the _http_ protocol.
2. The SOI sends a redirect telling the client to connect to, e.g., the URS system
   using the _https_ protocol (note the use of _https_ instead of _http_).
3. The client authenticates with URS.
4. URS sends a redirect (with authorization information) to send
   the client back to the SOI to actually obtain the data.

It turns out that libcurl, by default, uses the password in the
`.daprc` file (or from the url) for all connections that request
a password. This causes problems because only the specific
redirected connection actually requires the password.
This is where the `.netrc` file comes in. Libcurl will use `.netrc`
for the redirected connection. It is possible to cause libcurl
to use the `.daprc` password always, but this introduces a
security hole because it may send the initial user+pwd to every
server in the redirection chain.
In summary, if you are using redirection, then you are
__strongly__ encouraged to create a `.netrc` file to hold the
password for the site to which the redirection is sent.

The `.netrc` file contains lines that
typically look like this.

    machine mmmmmm login xxxxxx password yyyyyy

where the machine, mmmmmm, is the hostname of the machine to
which the client is redirected for authorization, and the
login and password are those needed to authenticate on that machine.

The location of the `.netrc` file can be specified by
putting the following line in your `.daprc`/`.dodsrc` file.

    HTTP.NETRC=<path to netrc file>

If not specified, then libcurl will look first in the current
directory, and then in the HOME directory.

One final note. When using this mechanism, you MUST
specify a real file in the file system to act as the
cookie jar file (HTTP.COOKIEJAR) so that the
redirect site can properly pass back authorization information.

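The pieces described in this section can be put together as in the
following sketch, which writes a `.netrc` file and a matching rc file. The
host name, login, password, the variable names, and the `.dods_cookies`
file name are placeholders chosen for illustration. The files are written
to a scratch directory by default so that no existing configuration is
overwritten.

```shell
#!/bin/sh
# Sketch: create a .netrc entry for the authentication host, plus rc
# file entries that point at it and at an on-disk cookie jar.
# All names and values below are placeholders.
RCDIR=${RCDIR:-$(mktemp -d)}
AUTH_HOST="urs.example.org"   # host to which the client is redirected
LOGIN="xxxxxx"
PASSWORD="yyyyyy"

umask 077   # keep the credential files private to the current user

# Credentials for the redirect target only (no escaping in .netrc).
printf 'machine %s login %s password %s\n' \
    "$AUTH_HOST" "$LOGIN" "$PASSWORD" > "$RCDIR/.netrc"

# rc file entries: the .netrc location and a real cookie jar file.
cat > "$RCDIR/.daprc" <<EOF
HTTP.NETRC=$RCDIR/.netrc
HTTP.COOKIEJAR=$RCDIR/.dods_cookies
EOF

echo "wrote $RCDIR/.netrc and $RCDIR/.daprc"
```

To try the result without touching the HOME directory, the _DAPRCFILE_
environment variable can be set to the full path of the generated `.daprc`
file before running a netCDF client.
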
## Client-Side Certificates {#CLIENTCERTS}

Some systems, notably ESG (Earth System Grid), require
the use of client-side certificates, as well as being
[re-direction based](#REDIR).
This requires setting the following entries:

- HTTP.COOKIEJAR &mdash; a file path for storing cookies across re-direction.
- HTTP.NETRC &mdash; the path to the netrc file.
- HTTP.SSL.CERTIFICATE &mdash; the file path for the client side certificate file.
- HTTP.SSL.KEY &mdash; this should have the same value as HTTP.SSL.CERTIFICATE.
- HTTP.SSL.CAPATH &mdash; the path to a "certificates" directory.
- HTTP.SSL.VALIDATE &mdash; force validation of the server certificate.

Note that the first two are there to support re-direction based authentication.

## Appendix A. All RC-File Keys {#allkeys}

For completeness, this is the list of all rc-file keys.
If this documentation is out of date with respect to the actual code,
the code is definitive.
<table>
<tr><th>Key</th><th>curl_easy_setopt Option</th></tr>
<tr valign="top"><td>HTTP.DEFLATE</td><td>CURLOPT_ACCEPT_ENCODING<br>with value "deflate,gzip"</td></tr>
<tr><td>HTTP.VERBOSE</td><td>CURLOPT_VERBOSE</td></tr>
<tr><td>HTTP.TIMEOUT</td><td>CURLOPT_TIMEOUT</td></tr>
<tr><td>HTTP.USERAGENT</td><td>CURLOPT_USERAGENT</td></tr>
<tr><td>HTTP.COOKIEJAR</td><td>CURLOPT_COOKIEJAR</td></tr>
<tr><td>HTTP.COOKIE_JAR</td><td>CURLOPT_COOKIEJAR</td></tr>
<tr valign="top"><td>HTTP.PROXY_SERVER</td><td>CURLOPT_PROXY,<br>CURLOPT_PROXYPORT,<br>CURLOPT_PROXYUSERPWD</td></tr>
<tr><td>HTTP.SSL.CERTIFICATE</td><td>CURLOPT_SSLCERT</td></tr>
<tr><td>HTTP.SSL.KEY</td><td>CURLOPT_SSLKEY</td></tr>
<tr><td>HTTP.SSL.KEYPASSWORD</td><td>CURLOPT_KEYPASSWD</td></tr>
<tr><td>HTTP.SSL.CAINFO</td><td>CURLOPT_CAINFO</td></tr>
<tr><td>HTTP.SSL.CAPATH</td><td>CURLOPT_CAPATH</td></tr>
<tr><td>HTTP.SSL.VERIFYPEER</td><td>CURLOPT_SSL_VERIFYPEER</td></tr>
<tr><td>HTTP.CREDENTIALS.USERPASSWORD</td><td>CURLOPT_USERPWD</td></tr>
<tr><td>HTTP.NETRC</td><td>CURLOPT_NETRC, CURLOPT_NETRC_FILE</td></tr>
</table>

## Appendix B. URS Access in Detail {#URSDETAIL}

It is possible to use the NASA Earthdata Login System (URS)
with netcdf by using the process specified in the
[redirection based authorization section](#REDIR).
In order to access URS controlled datasets, however, it is necessary to
register as a user with NASA at this website (subject to change):

    https://uat.urs.earthdata.nasa.gov/

## Appendix C. ESG Access in Detail {#ESGDETAIL}

It is possible to access Earth Systems Grid (ESG) datasets
from ESG servers through the netCDF API using the techniques
described in the section on [Client-Side Certificates](#CLIENTCERTS).

In order to access ESG datasets, however, it is necessary to
register as a user with ESG and to set up your environment
so that proper authentication is established between a netcdf
client program and the ESG data server. Specifically, it
is necessary to use what are called "client-side keys" to
enable this authentication. Normally, when a client accesses
a server in a secure fashion (using "https"), the server
provides an authentication certificate to the client.
With client-side keys, the client must also provide a
certificate to the server so that the server can know with
whom it is communicating. Note that this section is subject
to change as ESG changes its procedures.

The netcdf library uses the _curl_ library and it is that
underlying library that must be properly configured.

### Terminology

Using client-side keys requires the construction of
two "stores" on the client side.

* Keystore - a repository to hold the client side key.
* Truststore - a repository to hold a chain of certificates
  that can be used to validate the certificate
  sent by the server to the client.

The server actually has a similar set of stores, but the client
need not be concerned with those.

### Initial Steps

The first step is to obtain authorization from ESG.
Note that this information may evolve over time, and
may be out of date.
This discussion is in terms of BADC and NCSA. You will need
to substitute as necessary.

1. Register at http://badc.nerc.ac.uk/register
   to obtain access to BADC and to obtain an openid,
   which will look something like:
   <pre>https://ceda.ac.uk/openid/Firstname.Lastname</pre>

2. Ask BADC for access to whatever datasets are of interest.

3. Obtain short term credentials at
   _http://grid.ncsa.illinois.edu/myproxy/MyProxyLogon/_
   You will need to download and run the MyProxyLogon program.
   This will create a keyfile in, typically, the directory ".globus".
   The keyfile will have a name similar to this: "x509up_u13615".
   The other elements in ".globus" are certificates to use in
   validating the certificate your client gets from the server.

4. Obtain the program source ImportKey.java
   from this location: _http://www.agentbob.info/agentbob/79-AB.html_
   (read the whole page, it will help you understand the remaining steps).

### Building the KeyStore

You will have to modify the keyfile from the previous step
and then create a keystore and install the key and a certificate.
The commands are these:

    openssl pkcs8 -topk8 -nocrypt -in x509up_u13615 -inform PEM -out key.der -outform DER
    openssl x509 -in x509up_u13615 -inform PEM -out cert.der -outform DER
    java -classpath <path to ImportKey.class> -Dkeypassword="<password>" -Dkeystore=./<keystorefilename> ImportKey key.der cert.der

Note that the file names "key.der" and "cert.der" can be whatever you choose.
It is probably best to keep the .der extension, though.

### Building the TrustStore

Building the truststore is a bit tricky because, as provided, the
certificates in ".globus" need some massaging. See the script below
for the details. The primary command is this, which is executed for every
certificate, c, in ".globus". It adds the certificate to the file
named "truststore".

    keytool -trustcacerts -storepass "password" -v -keystore "truststore" -importcert -file "${c}"

### Running the C Client

Refer to the section on [Client-Side Certificates](#CLIENTCERTS).
The keys specified there must be set in the rc file to support ESG access.

- HTTP.COOKIEJAR=~/.dods_cookies
- HTTP.NETRC=~/.netrc
- HTTP.SSL.CERTIFICATE=~/esgkeystore
- HTTP.SSL.KEY=~/esgkeystore
- HTTP.SSL.CAPATH=~/.globus
- HTTP.SSL.VALIDATE=1

Of course, the file paths above are suggestions only;
you can modify them as needed.
The HTTP.SSL.CERTIFICATE and HTTP.SSL.KEY
entries should have the same value, which is the file path for the
certificate produced by MyProxyLogon. The HTTP.SSL.CAPATH entry
should be the path to the "certificates" directory produced by
MyProxyLogon.

As noted, ESG also uses re-direction based authentication.
So, when it receives an initial connection from a client, it
redirects to a separate authentication server. When that
server has authenticated the client, it redirects back to
the original url to complete the request.

### Script for Creating Stores

The following script shows in detail how to actually construct the key
and trust stores. It is specific to the format of the globus file
as it was when ESG support was first added. That format may have changed
since then, in which case you will need to seek some help
in fixing this script. It would help if you communicated
what you changed to the author so this document can be updated.

    #!/bin/sh -x
    KEYSTORE="esgkeystore"
    TRUSTSTORE="esgtruststore"
    GLOBUS="globus"
    TRUSTROOT="certificates"
    CERT="x509up_u13615"
    TRUSTROOTPATH="$GLOBUS/$TRUSTROOT"
    CERTFILE="$GLOBUS/$CERT"
    # Store password; not named PWD, which the shell uses for the working directory
    PASSWD="password"

    D="-Dglobus=$GLOBUS"
    CCP="bcprov-jdk16-145.jar"
    CP="./build:${CCP}"
    JAR="myproxy.jar"

    # Initialize needed directories
    rm -fr build
    mkdir build
    rm -fr "$GLOBUS"
    mkdir "$GLOBUS"
    rm -f "$KEYSTORE"
    rm -f "$TRUSTSTORE"

    # Compile MyProxyCmd and ImportKey
    javac -d ./build -classpath "$CCP" *.java
    javac -d ./build ImportKey.java

    # Execute MyProxyCmd
    java -cp "$CP" myproxy.MyProxyCmd

    # Build the keystore
    openssl pkcs8 -topk8 -nocrypt -in "$CERTFILE" -inform PEM -out key.der -outform DER
    openssl x509 -in "$CERTFILE" -inform PEM -out cert.der -outform DER
    java -Dkeypassword="$PASSWD" -Dkeystore="./${KEYSTORE}" -cp ./build ImportKey key.der cert.der

    # Clean up the certificates in the globus directory: drop everything
    # up to the first "---" line, then restore the BEGIN marker.
    # (The '0,/---/' address form requires GNU sed.)
    for c in "${TRUSTROOTPATH}"/*.0 ; do
        alias=`basename "$c" .0`
        sed -e '0,/---/d' <"$c" >"/tmp/${alias}"
        echo "-----BEGIN CERTIFICATE-----" >"$c"
        cat "/tmp/${alias}" >>"$c"
    done

    # Build the truststore
    for c in "${TRUSTROOTPATH}"/*.0 ; do
        alias=`basename "$c" .0`
        echo "adding: ${c}"
        echo "alias: $alias"
        yes | keytool -trustcacerts -storepass "$PASSWD" -v -keystore "./$TRUSTSTORE" -alias "$alias" -importcert -file "${c}"
    done
    exit
