Commit 41e2721e authored by gijs

Final Version of the crawler. Extended the readme, added a license

parent 3cf1b435
All the files in this folder are licensed under the Free Art License v1.3
http://artlibre.org/licence/lal/en/
Except for the file vgc.png, which is owned by the 'Vlaamse Gemeenschapscommissie'
and protected by their copyright. Please refer to http://www.vgc.be/ for more
information on the use of their logo.
=Algolit Catalog=
The Algolit catalog is generated with a workflow in which the content is retrieved
from the Algolit wiki. A simple scraper downloads a page from the wiki and
follows all the links to the lemmas.
The CSS is downloaded from an etherpad.
The resulting HTML is opened in a browser and 'printed' to a PDF file. The layout uses
CSS columns, and a bug in Firefox's column-fill property makes the PDF come out as a
single page, so currently only Chrome and Chromium can be used.
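The scraping step described above can be sketched roughly as follows. This is a minimal illustration, not the code in makeCatalog.py: the names `LinkCollector`, `lemma_links` and `crawl` are hypothetical, the crawl is assumed to go only one level deep, and the download function is passed in as `fetch` so the link handling can be tried without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def lemma_links(page_html, page_url):
    """Return absolute, de-duplicated links that stay on the wiki's host."""
    collector = LinkCollector()
    collector.feed(page_html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


def crawl(start_url, fetch):
    """Fetch the start page, then every same-host page it links to.

    `fetch` is any callable that maps a URL to the page's HTML as a string.
    """
    start_html = fetch(start_url)
    return {url: fetch(url) for url in lemma_links(start_html, start_url)}
```

For a real run, `fetch` could be something like `lambda url: urllib.request.urlopen(url).read().decode("utf-8")`; the actual wiki URL is not given here, so it is left to the caller.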
==Install requirements==
To install the requirements:
> pip install -r requirements.txt
==Generate==
To generate the HTML files for the French and English catalogs:
> python makeCatalog.py
==style.css==
@@ -13,4 +24,13 @@ To download the styles from the pad
> bash loadstyles.sh
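The contents of loadstyles.sh are not shown here, so the following is only a guess at what such a download could look like in Python. It assumes the pad runs on an Etherpad instance that serves its plain-text export under `/p/<pad name>/export/txt`; the function names, the base URL and the pad name are all hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen


def pad_export_url(base_url, pad_name):
    """Build the plain-text export URL of a pad (assumed Etherpad URL layout)."""
    return "{}/p/{}/export/txt".format(base_url.rstrip("/"), quote(pad_name, safe=""))


def load_styles(base_url, pad_name, target="style.css"):
    """Download the pad's text and write it to the stylesheet file."""
    with urlopen(pad_export_url(base_url, pad_name)) as response:
        css = response.read().decode("utf-8")
    with open(target, "w", encoding="utf-8") as handle:
        handle.write(css)
    return target
```

Calling `load_styles("https://pad.example", "catalog-css")` would write the pad's text to style.css; both arguments are placeholders for the real pad location.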
==read & print==
Open catalog.html in the browser and print the file to a PDF file.
==Page numbers (optional)==
Blink doesn't support page numbers yet, so they are generated with a hack: pagenumbers.html.
This file is 'printed' to a separate PDF file.
Using pdftk, the two PDF files are laid on top of each other:
> pdftk catalog-file.pdf multistamp pagenumbers-file.pdf output catalog-with-pagenumbers.pdf
Chrome and Chromium unexpectedly stopped printing 'transparent' PDFs, so for this catalog
[OSPKitPDF](http://osp.kitchen/tools/ospkit/) was used to generate a properly transparent file for the page numbers.
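The page number overlay needs one empty section per catalog page, so pagenumbers.html could also be generated instead of edited by hand. The sketch below is an illustration under assumptions, not part of this commit: `pagenumbers_html`, `multistamp_command` and `stamp` are hypothetical names, the page count is supplied by the caller, and the vgc.png logo that the committed file places in its last section is left out.

```python
import subprocess

# Skeleton of the overlay document; the stylesheet does the actual numbering
# through its CSS counter on .page.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Algolit Catalog</title>
<link rel="stylesheet" href="style.pagenumbers.css">
</head>
<body>
{sections}
</body>
</html>"""


def pagenumbers_html(page_count):
    """Build the overlay document with one empty section per catalog page."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    sections = "\n".join(
        '<section class="page"></section>' for _ in range(page_count)
    )
    return PAGE_TEMPLATE.format(sections=sections)


def multistamp_command(catalog_pdf, numbers_pdf, output_pdf):
    """Return the pdftk call that lays the numbers PDF over the catalog PDF."""
    return ["pdftk", catalog_pdf, "multistamp", numbers_pdf, "output", output_pdf]


def stamp(catalog_pdf, numbers_pdf, output_pdf):
    """Run pdftk; requires pdftk to be installed and on the PATH."""
    subprocess.run(multistamp_command(catalog_pdf, numbers_pdf, output_pdf), check=True)
```

Writing `pagenumbers_html(20)` to pagenumbers.html gives the same number of sections as the committed file; the printing to PDF still happens in the browser, after which `stamp(...)` runs the same pdftk command as above.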
@@ -19,14 +19,6 @@ template = u"""<!DOCTYPE html>
<link rel="stylesheet" href="style.css">
</head>
<body>
<svg class="defs-only">
<filter id="duotone" color-interpolation-filters="sRGB" x="0" y="0" height="100%" width="100%">
<feColorMatrix type="matrix" values=" 1 0 0 0 0
1 0 0 0 0
1 0 0 0 0
0 0 0 1 0" />
</filter>
</svg>
</body>
</html>"""
......
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Algolit Catalog</title>
<link rel="stylesheet" href="style.pagenumbers.css">
</head>
<body>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page"></section>
<section class="page">
<img id="vgc" src="vgc.png" />
</section>
</body>
</html>
\ No newline at end of file
@charset "utf-8";
@page {
size: 210mm 297mm;
margin: 0mm;
}
@page:right {
margin-left: 10mm;
margin-right: 6mm;
}
@page:left {
margin-left: 6mm;
margin-right: 10mm;
}
@font-face{
font-family:Fantasque;
src:url(fonts/fantasque/FantasqueSansMono-Regular.ttf);
font-weight:normal;
font-style:normal;
}
@font-face{
font-family:Fantasque;
src:url(fonts/fantasque/FantasqueSansMono-Italic.ttf);
font-weight:normal;
font-style:italic;
}
@font-face{
font-family:Fantasque;
src:url(fonts/fantasque/FantasqueSansMono-Bold.ttf);
font-weight:bold;
font-style:normal;
}
@font-face{
font-family:Fantasque;
src:url(fonts/fantasque/FantasqueSansMono-BoldItalic.ttf);
font-weight:bold;
font-style:italic;
}
html, body, * {
background: rgba(255,255,255,0);
}
body {
font-family:Fantasque;
font-size:12px;
line-height:1.4;
color: black;
counter-reset: page 0;
}
.page {
padding-top: 270mm;
counter-increment: page;
break-after: page;
position: relative;
}
.page:after {
background: transparent;
display: block;
text-align: center;
content: counter(page);
}
#vgc {
width: 15mm;
position: absolute;
bottom: 10mm;
right: 10mm;
}
\ No newline at end of file