This repository hosts Python automation utilities and user experience (UX) enhancements designed to help public and academic libraries in developing nations streamline their digital archiving and remote access systems.
Automates the conversion and ingestion of large spreadsheets (Excel/CSV) into DSpace-compatible metadata structures.
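The conversion script in this section expects the spreadsheet to use four exact column headers (`Book Title`, `Author Name`, `Year`, `Keywords`). As a minimal sketch, a pre-flight check along these lines could report missing headers before any mapping is attempted. The names `REQUIRED_COLUMNS` and `find_missing_columns` are hypothetical and not part of this repository.

```python
# Column headers the mapping script in this section reads from the spreadsheet.
REQUIRED_COLUMNS = ["Book Title", "Author Name", "Year", "Keywords"]


def find_missing_columns(header_names, required=REQUIRED_COLUMNS):
    """Return the required headers that are absent from header_names."""
    # Trim stray whitespace, which is common in hand-edited spreadsheets.
    present = {str(name).strip() for name in header_names}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    # Example header row with two columns missing.
    print(find_missing_columns(["Book Title", "Author Name"]))
```

With pandas, the headers of a loaded sheet can be passed in directly, for example `find_missing_columns(df.columns)`. Note that the check only reports headers that differ by surrounding whitespace as present; the mapping itself would still need the headers to match exactly.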
import pandas as pd

def prep_dspace_metadata(csv_file):
    # Read messy local spreadsheet
    df = pd.read_csv(csv_file)
    # Map fields to standard Dublin Core (dc) format
    dspace_df = pd.DataFrame({
        'dc.title': df['Book Title'],
        'dc.contributor.author': df['Author Name'],
        'dc.date.issued': df['Year'],
        'dc.subject': df['Keywords']
    })
    dspace_df.to_csv('dspace_bundle.csv', index=False)
    print("Successfully generated DSpace ingestion bundle!")

# Run the upload prep
prep_dspace_metadata('local_library_catalog.csv')

Cleans up file names, removes corrupt characters, and structures book/manuscript folders before they are uploaded to institutional repositories.
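Because several messy names can sanitize to the same clean name, it may help to compute the full rename plan first and review it before touching any files. The sketch below is one way to do that; `sanitize` and `plan_renames` are hypothetical helpers, and unlike the script in this section they collapse runs of special characters into a single underscore and append a numeric suffix on collisions.

```python
import os
import re


def sanitize(name):
    """Split a filename into a cleaned stem and a cleaned extension."""
    stem, ext = os.path.splitext(name)
    clean_stem = re.sub(r"[^A-Za-z0-9._-]+", "_", stem).strip("_") or "untitled"
    clean_ext = re.sub(r"[^A-Za-z0-9.]", "", ext)
    return clean_stem, clean_ext


def plan_renames(filenames):
    """Return {old_name: new_name} with no two files sharing a target name."""
    plan = {}
    taken = set()  # lowercased, so the plan also holds on case-insensitive disks

    # Pass 1: names that are already clean keep their name and reserve it.
    for name in filenames:
        if "".join(sanitize(name)) == name:
            plan[name] = name
            taken.add(name.lower())

    # Pass 2: every other file gets a sanitized name, suffixed if needed.
    for name in sorted(filenames):
        if name in plan:
            continue
        stem, ext = sanitize(name)
        candidate, counter = stem + ext, 1
        while candidate.lower() in taken:
            candidate = f"{stem}_{counter}{ext}"
            counter += 1
        taken.add(candidate.lower())
        plan[name] = candidate
    return plan
```

The plan can be built from `os.listdir(directory_path)` and printed for review; only the pairs whose old and new names differ would then be passed to `os.rename`.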
import os
import re

def clean_library_filenames(directory_path):
    for filename in os.listdir(directory_path):
        # Replace spaces and special characters with underscores
        clean_name = re.sub(r'[^a-zA-Z0-9._-]', '_', filename)
        old_file = os.path.join(directory_path, filename)
        new_file = os.path.join(directory_path, clean_name)
        if clean_name == filename:
            continue  # Name is already clean
        if os.path.exists(new_file):
            # Never overwrite an existing file with the same clean name
            print(f"Skipped {filename!r}: {clean_name!r} already exists")
            continue
        os.rename(old_file, new_file)
    print("Filenames sanitized for DSpace upload.")

Python helpers to parse and format e-journal subscription metadata feeds, ensuring clean indexing inside the MyLoFT app dashboard.
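Tag lists in real feeds often contain duplicates and empty strings once case and whitespace are normalized. The following is a minimal sketch of a stricter normalization step, assuming each record is a dictionary with an optional `tags` list as in the helper in this section. The actual MyLoFT feed schema is not documented here, and `normalize_tags` and `add_search_tags` are hypothetical names.

```python
def normalize_tags(tags):
    """Lowercase, trim, and de-duplicate tags, keeping first-seen order."""
    seen = set()
    result = []
    for tag in tags or []:
        cleaned = str(tag).strip().lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


def add_search_tags(records):
    """Return copies of the records with a normalized 'search_tags' list."""
    # Copying each record leaves the caller's original feed data unchanged.
    return [
        {**record, "search_tags": normalize_tags(record.get("tags"))}
        for record in records
    ]
```

Returning copies rather than editing the records in place makes it easier to compare the feed before and after normalization.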
def filter_myloft_feed(ejournal_data):
    optimized_records = []
    for record in ejournal_data:
        # Standardize discovery keywords for low-bandwidth mobile search;
        # records without a 'tags' field get an empty list
        record['search_tags'] = [tag.lower().strip() for tag in record.get('tags', [])]
        optimized_records.append(record)
    return optimized_records

Automatically generates proxy-wrapped access URLs so students can smoothly log in to electronic databases from outside the physical library building.
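When the database URL carries its own query string, embedding it verbatim in the proxy link can cause its parameters to be read as parameters of the proxy URL instead. The sketch below percent-encodes the target using the standard library. The `/proxy?url=` pattern and the `remotexs.xyz.edu` domain are the placeholders used in this README; whether a given RemoteXS gateway expects an encoded or a raw target is an assumption to verify against your institution's configuration, and `build_proxy_url` is a hypothetical name.

```python
from urllib.parse import urlencode, urlsplit


def build_proxy_url(target_db_url, proxy_domain="remotexs.xyz.edu"):
    """Wrap a database URL in a proxy link, percent-encoding the target."""
    parts = urlsplit(target_db_url)
    # Reject relative or scheme-less input early, before building a broken link.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not an absolute http(s) URL: {target_db_url!r}")
    query = urlencode({"url": target_db_url})
    return f"https://{proxy_domain}/proxy?{query}"


if __name__ == "__main__":
    print(build_proxy_url("https://jstor.org/search?q=a&b=1"))
```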
def generate_remotexs_url(target_db_url, proxy_domain="remotexs.xyz.edu"):
    # Wraps an academic database URL with your library's RemoteXS proxy gateway
    secure_remote_link = f"https://{proxy_domain}/proxy?url={target_db_url}"
    return secure_remote_link

print(generate_remotexs_url("https://jstor.org"))

- Clone this repository to your library computer:
git clone https://github.com
- Install the necessary Python packages:
pip install pandas
Developing and testing automation workflows for platforms like DSpace, MyLoFT, and RemoteXS takes substantial development time. Keeping these tools free means underfunded institutions don't have to hire expensive software consultants.
Please support this open-source journey by clicking the Sponsor button at the top right of this page!
Access our specialized medical library study guides and automation utilities:
- 📘 Wolters Kluwer Health Library Master Guide - Learn how to authenticate, bypass paywalls, and extract medical textbooks for your university study blocks.
- 🐍 Automated HTML Documentation Generator Script - Run this native Python automation script on your terminal to instantly build a stylized, offline-friendly HTML portal of your study guide material.
Students and librarians can use these simple terminal commands to run the script locally and generate the offline HTML guide.
Ensure you have Python installed on your computer. You can check by running:
python --version
- Clone this repository or download the generate_guide.py file to your computer.
- Open your terminal (Command Prompt on Windows, Terminal on Mac/Linux) and navigate to the folder containing the file.
- Run the script using the following command:
python generate_guide.py
Once completed, you will see a success message in your terminal:
✅ Generated 'wolters_kluwer_university_guide.html' successfully.
You can now double-click the newly created wolters_kluwer_university_guide.html file to open it instantly in any web browser for offline viewing!