Commit

Initial version
driver-deploy-2 committed May 2, 2024
0 parents commit cefcaa3
Showing 81 changed files with 1,170 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
/target
45 changes: 45 additions & 0 deletions .vscode/launch.json
@@ -0,0 +1,45 @@
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "type": "lldb",
            "request": "launch",
            "name": "Debug executable 'doc-parser'",
            "cargo": {
                "args": [
                    "build",
                    "--bin=doc-parser",
                    "--package=doc-parser"
                ],
                "filter": {
                    "name": "doc-parser",
                    "kind": "bin"
                }
            },
            "args": [],
            "cwd": "${workspaceFolder}"
        },
        {
            "type": "lldb",
            "request": "launch",
            "name": "Debug unit tests in executable 'doc-parser'",
            "cargo": {
                "args": [
                    "test",
                    "--no-run",
                    "--bin=doc-parser",
                    "--package=doc-parser"
                ],
                "filter": {
                    "name": "doc-parser",
                    "kind": "bin"
                }
            },
            "args": [],
            "cwd": "${workspaceFolder}"
        }
    ]
}
9 changes: 9 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,9 @@
{
    "spellright.language": [
        "nl"
    ],
    "spellright.documentTypes": [
        "latex",
        "plaintext"
    ]
}
248 changes: 248 additions & 0 deletions Cargo.lock

Generated file; diff not rendered.

11 changes: 11 additions & 0 deletions Cargo.toml
@@ -0,0 +1,11 @@
[package]
name = "doc-parser"
version = "0.1.0"
edition = "2021"

[dependencies]
docx-rust = "0.1.7"
# docx-rust = { git="https://github.com/erikvullings/docx-rs.git" }



23 changes: 23 additions & 0 deletions convert_to_markdown.sh
@@ -0,0 +1,23 @@
#!/bin/bash

# Specify the input folder containing the .docx files
input_folder="./test"

# Specify the media folder where media files will be extracted
media_folder="${input_folder}"

# Loop through all .docx files in the input folder
for file in "$input_folder"/*.docx; do
    # Check if the file is a regular file
    if [ -f "$file" ]; then
        # Extract the file name without extension
        filename=$(basename -- "$file")
        filename_no_ext="${filename%.*}"

        # Convert the .docx file to Markdown using Pandoc. See also https://stackoverflow.com/a/74654058/319711 for heading style (--markdown-headings=atx replaces --atx-headers, deprecated since pandoc 2.11.2)
        pandoc -s "$file" --wrap=none --reference-links --markdown-headings=atx --extract-media="$media_folder" -t markdown -o "${input_folder}/${filename_no_ext}.md"

        echo "Converted $file to Markdown."
    fi
done
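
The output path used by convert_to_markdown.sh comes from two steps: `basename` drops the directory, and the `${filename%.*}` expansion strips only the last extension. The following is a minimal sketch of that derivation in isolation. The helper name `md_path_for` is hypothetical and not part of this commit, and it assumes the Markdown file is written next to the input, as the script does when every input comes from `input_folder`.

```shell
#!/bin/bash

# Hypothetical helper (not in the commit): map a .docx path to the
# Markdown path that would sit next to it.
md_path_for() {
    local file="$1"
    local filename
    filename=$(basename -- "$file")
    # "%.*" removes the shortest trailing match of ".*",
    # so only the final extension is dropped.
    printf '%s/%s.md\n' "$(dirname -- "$file")" "${filename%.*}"
}

md_path_for "./test/report.final.docx"
```

Because only the shortest suffix is removed, a name such as report.final.docx keeps its inner dot and becomes report.final.md.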

Binary file added example.docx
Binary file not shown.