Welcome!
This is a place for errata broadly related to personal and professional interests.
Check out some specific pages like quotes and map of active tropical cyclones.
If it can be destroyed by the truth, it deserves to be destroyed by the truth. ⇨
“Always take sides. Neutrality helps the oppressor, never the victim. Silence encourages the tormentor, never the tormented.” ⇨
“I am so tired of waiting. Aren't you, for the world to become good and beautiful and kind? Let us take a knife and cut the world in two-- and see what worms are eating at the rind.” ⇨
Recent Posts
24 Feb 2023
This previous post demonstrated a way to use a generic website's HTML source to create an inventory of web-accessible files from the URL links it contains. A similar methodology can be applied to search through a filesystem directory tree and build a catalog of files with matching file types.
Use glob to recursively globstar-match filepaths
In this case I use Python’s built-in glob and regular expression modules to list files and match extensions in the names. I used the os.path collection of utility functions to pull directory names from the full paths, but a more modern way would probably be to use the built-in pathlib. Pandas is the only non-built-in package used, and it could be dropped if a DataFrame is not the desired output.
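As a rough illustration of the approach described above, here is a minimal sketch rather than the code from the full post. The function name `catalog_files`, the default extensions, and the record keys are all assumptions made up for this example. It uses only the standard library and returns a plain list of dictionaries.

```python
import glob
import os
import re


def catalog_files(root, extensions=("csv", "nc")):
    """Return one record per file under `root` whose extension matches.

    `catalog_files` and its default extensions are hypothetical,
    chosen only for illustration.
    """
    # Build a case-insensitive pattern such as r"\.(csv|nc)$".
    pattern = re.compile(
        r"\.(" + "|".join(re.escape(ext) for ext in extensions) + r")$",
        re.IGNORECASE,
    )
    records = []
    # "**" with recursive=True is the globstar: it matches any depth of
    # subfolders, including none at all.
    search = os.path.join(root, "**", "*")
    for path in sorted(glob.glob(search, recursive=True)):
        if os.path.isfile(path) and pattern.search(path):
            records.append(
                {
                    "directory": os.path.dirname(path),
                    "filename": os.path.basename(path),
                    "extension": os.path.splitext(path)[1].lstrip(".").lower(),
                }
            )
    return records
```

If pandas is installed, passing the result to `pandas.DataFrame(catalog_files("."))` should give a table with one row per matching file. Note that glob skips names starting with a dot by default, so hidden files are not cataloged by this sketch.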
21 Jan 2023
Examples of accessible data inventories
Simple over Standards
Frequently the biggest hurdle to sharing data is the pre-existing organization of files. While open-standards-based API access can smooth over the bumps inherent to locally stored, heterogeneous collections of files, the amount of effort needed to set up an existing API web service to share your data can be overwhelming. (Not to mention having to roll your own in a standards-compliant way.)
The NOAA example below is a great illustration of a simple solution that can be used as a template to provide easy programmatic access to any dataset.
15 Dec 2022
`diff` Also Compares Directories
Short post so that I can remember this every time I need to do something similar!
Using diff on more than individual files
The quick and dirty explanation is that the GNU/Linux diff command has an -r flag to recursively compare two folders. The command help shows that it is shorthand for the full --recursive flag, which might be easier to remember.
diff --help
# ...
-r, --recursive recursively compare any subdirectories found
# ...
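For cases where the comparison needs to happen inside a script, Python's standard library offers a loose stand-in for this recursive behavior. The sketch below uses `filecmp.dircmp`, which is a different tool from diff and not a wrapper around it. The function name `compare_dirs` is hypothetical, and the output lines are only modeled on diff's brief messages, not guaranteed to match them exactly.

```python
import filecmp
import os


def compare_dirs(left, right):
    """Recursively compare two folders and return summary lines.

    `compare_dirs` is a hypothetical helper; the wording of its
    output imitates diff's brief messages but is not produced by diff.
    """
    lines = []

    def walk(cmp):
        # Entries present on only one side.
        for name in sorted(cmp.left_only):
            lines.append(f"Only in {cmp.left}: {name}")
        for name in sorted(cmp.right_only):
            lines.append(f"Only in {cmp.right}: {name}")
        # Compare file contents, not just size and timestamps (shallow=False).
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        for name in sorted(mismatch + errors):
            a = os.path.join(cmp.left, name)
            b = os.path.join(cmp.right, name)
            lines.append(f"Files {a} and {b} differ")
        # Descend into subdirectories that exist on both sides.
        for name in sorted(cmp.subdirs):
            walk(cmp.subdirs[name])

    walk(filecmp.dircmp(left, right))
    return lines
```

As with diff, files that match on both sides produce no output. Unlike diff, this sketch reports only that two files differ, not the line-by-line changes, and `filecmp.dircmp` skips a few names such as `.git` and `__pycache__` by default.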
Example
In the following example, “Only in” shows that particular files are only found in one of the folders. By default, matching files are not shown. If a file can be found in both folders and the two versions differ, the normal diff output is provided along with the modification times. Taken together, these details provide a good summary of what a user might want to know when comparing two directories.