# Klex Git utilities

With the help of these tools, you can create a Git repository that relies
on Klex for data processing and AI services.

## Example use case and tutorial

1. Create a new Git repository.
```bash
mkdir goodevil
cd goodevil
git init
```

2. Create a dataset containing a book split into paragraphs.
```bash
mkdir datasets
cd datasets
wget https://gutenberg.org/cache/epub/4363/pg4363.txt -O book.txt
mkdir paragraphs
for i in {1..296}
do
  # Double quotes let the shell expand $i and $((i+1)) inside the awk program.
  awk "/^$i\. /{f=1} f; /^$((i+1))\. / && f{exit}" book.txt > paragraphs/$i.txt
done
rm book.txt
cd ..
```
At this point, you should have a directory called datasets/paragraphs/ with 296
files in it, each containing a paragraph from the book. See 256.txt for an
example of an edge case.
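
The loop above relies on an awk flag pattern: start printing at the line that begins with the current number and stop when the line that begins with the next number appears. A minimal sketch of that idea on a three-paragraph stand-in for book.txt is shown here. The sample text, the temporary directory, and the `-v n=` variable are assumptions made for illustration, and this variant tests for the next number before printing, so that line is left out of the current file.

```bash
# Build a tiny stand-in for book.txt with three numbered paragraphs.
tmp=$(mktemp -d)
cat <<'EOF' > "$tmp/book.txt"
Preface text that is not part of any numbered paragraph.

1. First paragraph,
continued on a second line.

2. Second paragraph.

3. Third paragraph.
EOF

mkdir "$tmp/paragraphs"
for i in 1 2 3
do
  # f is the "inside paragraph n" flag: stop at the next number,
  # switch on at the current number, and print while the flag is set.
  awk -v n="$i" '
    $0 ~ ("^" (n + 1) "[.] ") && f { exit }
    $0 ~ ("^" n "[.] ") { f = 1 }
    f
  ' "$tmp/book.txt" > "$tmp/paragraphs/$i.txt"
done

# Inspect the result: one file per paragraph, each starting with its own number.
ls "$tmp/paragraphs"
cat "$tmp/paragraphs/1.txt"
```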

3. Create a JavaScript function that generates LLM prompts.
```bash
mkdir functions
cat <<EOF > functions/tag.js
function(txt) {
  return "Produce a list of useful search words for the following" +
    " paragraph of text. Present the list in lower case, separated by" +
    " commas, prefixed with the heading \"Tags: \", and ending in a" +
    " period. For example, if the paragraph talks primarily about" +
    " the way teachers view their profession, output this line:\n" +
    " Tags: teachers, teaching, profession.\n\n" +
    " Paragraphs: " + txt.replace(/^\d+[.] +/, "");
}
EOF
```
4. Create a JavaScript pipeline that tags each paragraph.
```bash
mkdir pipelines
cat <<EOF > pipelines/tag.js
function(api) {
  return api.dataset("paragraphs")
    .map("tag")
    .map("llama3")
    .name("tagged");
}
EOF
```

5. Create a Klex config file.
```bash
cat <<EOF > klex.json
{
  "project_name": "goodevil",
  "datasets_dir": "datasets",
  "functions_dir": "functions",
  "pipelines_dir": "pipelines",
  "klex_url": "https://oscarkilo.com/klex",
  "api_key_file": "klex.api_key"
}
EOF
```

6. Get an API key [here](https://oscarkilo.com/login/profile) and save it to
a file.
```bash
echo "YOUR_API_KEY" > klex.api_key
echo "klex.api_key" > .gitignore
```
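
Before committing, it can be worth confirming that Git really ignores the key file. The sketch below repeats the two commands above in a throwaway repository and checks the result with `git check-ignore` and `git ls-files`. The temporary directory and the `chmod 600` step are assumptions added for illustration and are not part of the tutorial.

```bash
# Work in a throwaway repository so the real project is untouched.
tmp=$(mktemp -d)
cd "$tmp"
git init -q

# Same two files as in the step above, with a placeholder key.
echo "YOUR_API_KEY" > klex.api_key
echo "klex.api_key" > .gitignore

# Restrict the key file to the current user (an extra precaution, assumed here).
chmod 600 klex.api_key

# Prints the path and exits with status 0 when Git ignores it.
git check-ignore klex.api_key

# Stage everything and list what Git tracks: only .gitignore should appear.
git add .
git ls-files
```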

7. Add Klex Git utilities and hooks to your repository.
First, clone this repository to the ../klex-git directory:
```bash
cd ..
git clone https://code.oscarkilo.com/klex-git
cd goodevil
```
Then, create a go.mod file that points at the local clone, and run the installer:
```bash
cat <<EOF > go.mod
module goodevil
go 1.21
toolchain go1.21.3
replace oscarkilo.com/klex-git => ../klex-git
EOF
go get oscarkilo.com/klex-git
go run oscarkilo.com/klex-git/install
```
8. Commit your changes.
```bash
git add .
git commit -m "First!"
```

## Available models and the features they support

| LLM Name          | Image input | System prompts | Tool use |
|-------------------|-------------|----------------|----------|
| Claude 3 Opus     | works       | works          | TODO     |
| Claude 3.5 Sonnet | works       | works          | TODO     |
| Phi 3.5           | works       | works          | TODO     |
| GPT4              | didn't test | didn't test    | TODO     |
| GPT4o             | works       | works          | TODO     |
| o1-preview        | works       | works          | TODO     |
| Llama 3.2         | didn't test | no             | TODO     |