Private Library Catalogue in Obsidian

Go anywhere you wish, talk to everyone. Ask any questions; you will be given answers. When you want to learn, you will be taught. Use the library. Open any book.
«Unseen Academicals» by Terry Pratchett

One problem with owning books digitally is that you do not really see them. They are virtual, not physical artifacts that can catch your attention. Sure, how often do you actually look at your bookshelf? It’s stationary, doesn’t change, so it fades into the background quickly. But still.

So, given that the books I own are almost all digital (3650, including a few duplicates in different versions), I invested an evening in creating a library catalogue of these books. A rather simple one, but one that should work.

The starting point was a folder with 0-9 and A-Z subfolders containing PDF files. The books are named in the author_…year format, followed by the title, for example, «wessel_2012 Organizing Creativity.pdf». I used an R script (see below) to go through these folders and extract the relevant information (file name, year, title). Additionally, the pdftools package rendered the first page of each PDF as the cover image and saved it as a JPG.
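
As a minimal sketch of that naming convention (base R only; the helper name `parse_book_name` is made up for illustration, and the full script below does the same job with stringr), a file name can be split like this:

```r
# Hypothetical helper: split "author_year Title.pdf" into its parts.
# Assumes the first space separates the author_year key from the title.
parse_book_name <- function(file_name) {
  # Drop the folder part and the .pdf extension
  stem <- sub("\\.pdf$", "", basename(file_name), ignore.case = TRUE)
  # Everything before the first space is the author_year key
  cite_key <- sub(" .*$", "", stem)
  # The rest of the name is the title
  title <- trimws(substring(stem, nchar(cite_key) + 1))
  # The year is whatever follows the last underscore in the key
  year <- sub("^.*_", "", cite_key)
  list(citeKey = cite_key, year = year, title = title)
}

book <- parse_book_name("W/wessel_2012 Organizing Creativity.pdf")
print(book$citeKey)
print(book$year)
print(book$title)
```

File names without a space or an underscore would need extra handling; the sketch does not check for them.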

My main Obsidian vault has a page that links to the PDF files, sorted under headers (e.g., «Cats and Dogs»), so I used that list to assign a category to each book.
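
The idea can be sketched in a few lines of base R. This is only an illustration under assumptions: the function name `assign_categories` is hypothetical, header lines are assumed to start with «# », and every other non-empty line is assumed to be a file name. The script below uses a loop for the same purpose.

```r
# Hypothetical helper: give every non-header line the most recent header as category.
assign_categories <- function(text_lines) {
  is_header <- startsWith(text_lines, "# ")
  header_names <- sub("^# ", "", text_lines[is_header])
  # Running count of headers seen so far; 0 means "before the first header"
  idx <- cumsum(is_header)
  category <- ifelse(idx == 0, NA_character_, header_names[pmax(idx, 1)])
  # Keep only the book lines: not a header and not blank
  keep <- !is_header & nzchar(trimws(text_lines))
  data.frame(name = text_lines[keep], category = category[keep],
             stringsAsFactors = FALSE)
}

# Made-up example input
example_lines <- c("# Cats and Dogs", "a_2001 Cats.pdf", "",
                   "# Other", "b_2002 Misc.pdf")
print(assign_categories(example_lines))
```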

Next, the script created an individual entry for each book, containing backlinks to the main Obsidian page and a category page (see below), Title, CiteKey (author_…year), Year, a link to the PDF, Section (Category), Priority (default: none), a link to the NotesFile (if I have already read the book and made notes in my main Obsidian Vault, it creates a link to them), a QuickNote field for notes in that Vault, the Cover image, and placeholders for APA and BibTeX citations. In order to use the data in Obsidian’s Dataview (see below), I wrote the fields with two colons («::»), which is Dataview’s inline field syntax.
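
To make the double colon concrete, here is a stripped-down sketch of how such an entry can be assembled in R. The function name `make_entry` and the example values are made up for illustration; the full script below builds a longer entry with more fields.

```r
# Hypothetical helper: build a short book entry with Dataview inline fields.
# Each "Key:: value" line becomes a field that Dataview can query.
make_entry <- function(title, cite_key, year, section) {
  paste0(
    "*Title*:: **", title, "**\n",
    "- *CiteKey*:: ", cite_key, "\n",
    "- *Year*:: ", year, "\n",
    "- *Section*:: ", section, "\n"
  )
}

# Example values; the section name is an assumption
entry <- make_entry("Organizing Creativity", "wessel_2012", "2012", "Creativity")
cat(entry)
```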

As each book had a category, R created section pages, e.g., a «Cats and Dogs» section that contains all respective books. Given that the list can change and updating it manually would suck, I used Dataview (a community plugin). Someone found a way to show multiple entries in the same cell; here: a link to the book entry (Card), a link to the PDF (PDF), and (if it exists) a link to the page in my main Obsidian Vault (Notes Entry). So:

```dataview
TABLE WITHOUT ID
Cover, "Card: " + file.link + "<br>PDF:" + PDF + "<br>Notes Entry: " + NotesFile as "Info"
WHERE Section = "Cats and Dogs"
```

provides you with a nice list of the books:

The section pages themselves are linked on the main Obsidian page:

If I add new books, I can either use a modified R Script (would still have to write it) or simply use an Obsidian Template:
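
For the script route, the first step would be finding the PDFs that do not have an entry yet. A rough sketch, assuming entries are named after the PDF they describe (as in the script below); the function name `find_new_books` and the example paths are hypothetical:

```r
# Hypothetical helper: return the PDFs that have no matching .md entry yet.
find_new_books <- function(pdf_files, entry_files) {
  # Compare file names without folder and extension
  pdf_stems <- sub("\\.pdf$", "", basename(pdf_files), ignore.case = TRUE)
  entry_stems <- sub("\\.md$", "", basename(entry_files), ignore.case = TRUE)
  pdf_files[!(pdf_stems %in% entry_stems)]
}

# Made-up example: one book already has an entry, one does not
pdfs <- c("A/a_2001 Cats.pdf", "B/b_2002 Misc.pdf")
entries <- c("Pages/a/a_2001 Cats.md")
print(find_new_books(pdfs, entries))
```

In practice both vectors would come from `list.files()` calls on the book folder and the entry folder.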

So yeah, overall it’s quite nice. Including the direct link to the files:

Only downside so far is that the library vault continues to re-index the pages when I open it. Not sure why. It should only do it once, but somehow it does not seem to store the information.

But besides this issue, it is a nice way to get a structured overview of one’s library. You can use the cards to set the reading priority, or just browse through different sections.

BTW, the R code is the following (however, I did some stuff outside of R, among other things some changes with BBEdit that did not warrant exporting all book index files again, and I had to go over the cover names as the Dataview plugin cannot handle diacritics):

library(tidyverse) # provides tibble, dplyr, stringr, and readr

# Collect all PDFs and split each file name into author_year and title
booksNF <- tibble(fullPath = list.files("PATHINFORMATION/Non-Fiction/books", full.names = TRUE, recursive = TRUE, pattern = "\\.pdf$"))
booksNF <- booksNF %>% rowwise() %>% mutate(
nameOnly = str_replace(basename(fullPath), "\\.pdf$", ""),
authorYear = str_split(nameOnly, " ")[[1]][[1]],
titleOnly = trimws(str_replace(nameOnly, fixed(authorYear), ""))
)
write_csv(booksNF, "OBSIDIANPATH/booksNF.csv")

existingNF <- tibble(nameOnly = list.files("PATHINFORMATION/Sources/Books - Non-Fiction Sources", full.names = FALSE, recursive = TRUE, pattern = "\\.md$"))
existingNF <- existingNF %>% mutate(
nameOnly = str_replace(nameOnly, "\\.md$", ""),
inObsidian = TRUE
)

booksNF2 <- full_join(booksNF, existingNF, keep = TRUE, join_by("nameOnly"))
sum(booksNF2$inObsidian, na.rm=TRUE)
workData <- booksNF2 %>% filter(!is.na(authorYear))

rm(booksNF)
rm(booksNF2)
rm(existingNF)

workData <- workData %>% rowwise() %>% mutate(selectPath = str_replace(fullPath, "SHORTPATHINFORMATION", "~"))
write_csv(workData, "OBSIDIANPATH/workData_booksNF.csv")
workData$coverCreated <- FALSE

library(pdftools)
library("beepr")
for(i in 1:nrow(workData)) {
fileLocation <- workData[[i, "selectPath"]]
author_year_output <- paste0("Covers/Cover ", workData[[i, "nameOnly.x"]], ".jpg")
pdf_convert(pdf = fileLocation, format = "jpeg", pages = 1, filenames = c(author_year_output), dpi = 72)
workData[[i, "coverCreated"]] <- TRUE
if(i%%10 == 0) {
print("waiting ...")
Sys.sleep(sample(3, 1)) # pause for 1 to 3 seconds
}
}
beep(4)

workData <- workData %>% rowwise() %>% mutate(
coverPath = paste0("Covers/", str_to_upper(substr(authorYear, 1, 1)), "/Cover ", nameOnly.x, ".jpg")
)

write_csv(workData, "OBSIDIANPATH/workData_booksNF.csv")

workData <- workData %>% rowwise() %>% mutate(
citePath = str_replace_all(nameOnly.x, " ", "%20")
)

paste0("[", workData[[17, "nameOnly.x"]],"](obsidian://open?vault=MAINVALULT&file=Sources%2FBooks%20-%20Non-Fiction%20Sources%2F", str_replace(workData[[17, "citePath"]], ".pdf", ""), ")")

workData <- workData %>% mutate(
year = str_split(authorYear, "_")[[1]][[length(str_split(authorYear, "_")[[1]])]]
)

for(i in 1:nrow(workData)) {
workData[[i, "coverName"]] <- paste0("Cover ", workData[[i, "nameOnly.x"]], ".jpg")
}

catText <- read_lines("categories.txt")
catData <- tribble(~name, ~category)
catName <- ""
for(i in 1:length(catText)) {
curText <- catText[[i]]
if(nchar(curText) > 0) {
if(str_detect(curText, "# ")) {
catName <- str_replace(curText, "# ", "")
} else {
catData <- catData %>% add_row(
name = curText,
category = catName
)
}
}
}
catData <- catData %>% filter(nchar(name)>0)

workData <- workData %>% rowwise() %>% mutate(
name = paste0(nameOnly.x, ".pdf")
)

workData <- workData %>% left_join(catData, by = "name", keep = TRUE)
workData <- workData %>% rowwise() %>% mutate(
category = str_replace_all(category, "#", "")
)

write_csv(workData, "OBSIDIANPATH/workData_booksNF.csv")

workData <- read_csv("OBSIDIANPATH/workData_booksNF.csv")

makeSane <- function(x) {
x <- str_to_lower(x)
x <- str_replace_all(x, " ", "_")
x <- str_replace_all(x, "'", "")
x <- str_replace_all(x, "ä", "ae")
x <- str_replace_all(x, "ö", "oe")
x <- str_replace_all(x, "ü", "ue")
x <- str_replace_all(x, "ß", "ss")
x <- stringi::stri_trans_general(x, "Latin-ASCII")
return(x)
}

graphicsListData <- tibble(graphic = list.files("OBSIDIANPATH/LibraryCatalog/1 Media/Covers", full.names = FALSE, recursive = TRUE, pattern = "\\.jpg$"))
graphicsListData <- graphicsListData %>% rowwise() %>% mutate(
filename = str_split(graphic, "/")[[1]][[length(str_split(graphic, "/")[[1]])]],
filenameSane = makeSane(filename),
pathLetter = str_replace(graphic, filename, "")
)

for(i in 1:nrow(graphicsListData)) {
file.rename(paste0("OBSIDIANPATH/LibraryCatalog/1 Media/Covers/", graphicsListData[[i, "graphic"]]), paste0("OBSIDIANPATH/LibraryCatalog/1 Media/Covers/", graphicsListData[[i, "pathLetter"]], graphicsListData[[i, "filenameSane"]]))
}

for(i in 1:nrow(workData)) {
noteRef = "none"
if(!is.na(workData[[i, "inObsidian"]])) {
noteRef = paste0("[", workData[[i, "nameOnly.x"]], "](obsidian://open?vault=MAINVALULT&file=Sources%2FBooks%20-%20Non-Fiction%20Sources%2F", str_replace_all(workData[[i, "nameOnly.x"]], " ", "%20"), ")")
}
outputline = paste0("[[Library|Front Desk]] | [[", workData[[i, "category"]], "]]\n\n*Title*:: **", workData[[i, "titleOnly"]], "**\n- *CiteKey*:: ", workData[[i, "authorYear"]], "\n- *Year*:: ", workData[[i, "year"]], "\n- *PDF*:: [", workData[[i, "nameOnly.x"]], "](file://", str_replace_all(workData[[i, "fullPath"]], " ", "%20"), ")\n- *Section*:: ", workData[[i, "category"]], "\n- *Priority*: none\n---\n- *NotesFile*:: ", noteRef, "\n- *QuickNote*:: none\n---\n*Cover*:: ![[", makeSane(workData[[i, "coverName"]]), "|200]]\n\n---\n*APA*:: \n\n```tex\nBibTeX\n```\n")
write_lines(outputline, paste0("Pages/", substr(workData[[i, "authorYear"]], 1, 1), "/", workData[[i, "nameOnly.x"]], ".md"))
}

catList <- unique(workData$category)
for(i in 1:length(catList)) {
outputLines <- paste0("Section ", catList[[i]], "\n=====================\n[[Library|Front Desk]]\n\n```dataview\nTABLE WITHOUT ID\n Cover, file.link, PDF, NotesFile\n WHERE Section = \"", catList[[i]], "\"\n```\n\n[[#Section ", catList[[i]], "|back to top]]")
write_lines(outputLines, paste0("Sections/", catList[[i]], ".md"))
}

catList <- sort(catList)
outlines <- "## Sections"
for(i in 1:length(catList)) {
outlines <- c(outlines, paste0("- [[", catList[[i]], "]]"))
}
write_lines(outlines, "sectionpart.md")

So, it’s without any warranty, as always.