rwkv

package module
v0.0.0-...-661e7ae Latest
Warning

This package is not in the latest version of its module.

Published: Feb 28, 2024 License: MIT Imports: 14 Imported by: 1

README

AI with RWKV

go-rwkv.cpp is a Go wrapper around rwkv.cpp, which is an adaptation of ggml.cpp.

Features

rwkv.cpp is generally faster because it keeps the intermediate state of the model, so the entire prompt doesn't have to be reprocessed every time. For more details, see rwkv.cpp.
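To make the point about kept state concrete, here is a toy sketch. It does not use RWKV at all: `toyState`, `step` and `feed` are made-up names, and the "model" is a one-number recurrence. It only counts how many token evaluations are needed when the state is kept versus when the whole history is replayed for every new token.

```go
package main

import "fmt"

// toyState stands in for a recurrent model's hidden state.
// A real RWKV state is a large float32 vector; one number is enough here.
type toyState struct {
	hidden float64
	steps  int // number of tokens evaluated so far
}

// step folds one token into the state, the way a recurrent model
// consumes its input one token at a time.
func (s *toyState) step(token int) {
	s.hidden = 0.5*s.hidden + float64(token)
	s.steps++
}

// feed evaluates every token in order.
func (s *toyState) feed(tokens []int) {
	for _, t := range tokens {
		s.step(t)
	}
}

func main() {
	prompt := []int{3, 1, 4, 1, 5}
	reply := []int{9, 2}

	// Stateful: the prompt is evaluated once, later tokens reuse the state.
	kept := &toyState{}
	kept.feed(prompt)
	kept.feed(reply)

	// Stateless: every new token forces a fresh pass over the whole history.
	total := 0
	var fresh *toyState
	for i := 1; i <= len(reply); i++ {
		fresh = &toyState{}
		fresh.feed(prompt)
		fresh.feed(reply[:i])
		total += fresh.steps
	}

	fmt.Println("stateful evaluations: ", kept.steps) // 7
	fmt.Println("stateless evaluations:", total)      // 13
	fmt.Println("same final state:", kept.hidden == fresh.hidden)
}
```

Both paths end in the same state; the stateful one simply gets there with fewer evaluations, and the gap grows with the length of the prompt.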

Also, the available models for rwkv.cpp are fully open-source, unlike llama. You can use these models commercially, and you can modify them to your heart's content.

Training may also be faster, but I haven't had a chance to try that yet.

Installation

Installation is currently complex. go-rwkv.cpp does not work with go get yet (patches very welcome). You will need Go, a C++ compiler (clang on Mac), and CMake.

Download

You must clone this repo /recursively/, as it contains submodules.

    git clone --recursive https://github.com/donomii/go-rwkv.cpp

Building

There is a build script, build-mac.sh, which builds the C++ library and the Go wrapper. Please file a bug report if it doesn't work for you.

    ./build-mac.sh

There is now an alternate build, which builds statically thanks to a makefile provided by @mudler.

    make example/ai

Download models

The download script will download some models, and convert them to the correct format.

    cd aimodels
    sh downloadconvert.sh

Install

go-rwkv.cpp currently builds against the dynamic library librwkv.dylib. This is not ideal, but it works for now. You will need to copy this library to a location where the system linker can find it. On Mac, this is /usr/local/lib.

    cp librwkv.dylib /usr/local/lib

If you don't want to install it globally, you can instead set the DYLD_LIBRARY_PATH environment variable to the directory containing librwkv.dylib. For example, on the author's machine:

    export DYLD_LIBRARY_PATH=/Users/donomii/git/go-rwkv.cpp/rwkv.cpp/

Use

See the example/ directory for a full working chat program. The following is a minimal example.

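    package main

    import (
        "fmt"
        "log"

        rwkv "github.com/donomii/go-rwkv.cpp"
    )

    func main() {
        // Load the model and tokenizer, using 8 threads.
        model := rwkv.LoadFiles("aimodels/small.bin", "rwkv.cpp/rwkv/20B_tokenizer.json", 8)

        // Feed the prompt to the model; this updates the model state.
        if err := model.ProcessInput("You are a chatbot that is very good at chatting.  blah blah blah"); err != nil {
            log.Fatal(err)
        }

        // Generate up to 100 tokens, stopping at the first newline.
        // The temperature (0.8) and top_p (0.9) are example settings, and the
        // callback is assumed to return true to keep generating.
        response := model.GenerateResponse(100, "\n", 0.8, 0.9, func(s string) bool { return true })
        fmt.Println(response)
    }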

You must use the tokenizer file from rwkv.cpp. go-rwkv contains a re-implementation of the tokenizer, but it is a minimal implementation that contains just enough code to work with rwkv (and there are probably bugs in it).

Packaging

To ship a working program that includes this AI, you will need to include the following files:

  • librwkv.dylib
  • the model file (e.g. RWKV-4-Raven-1B5-v9-Eng99%-Other1%-20230411-ctx4096_quant4.bin)
  • the tokenizer file (i.e. 20B_tokenizer.json)

If you don't install librwkv.dylib globally, you will need to set the DYLD_LIBRARY_PATH environment variable to the directory containing librwkv.dylib.
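A program can check at startup that the files listed above were actually shipped with it. The following is a minimal sketch using only the standard library; `missingFiles` is a hypothetical helper, and the bundle layout (all three files in one directory, with the model named small.bin as in the example above) is an assumption. It does not check that the dynamic linker can find librwkv.dylib, only that the file exists.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// missingFiles returns the names from want that are not regular files in dir.
func missingFiles(dir string, want []string) []string {
	var missing []string
	for _, name := range want {
		info, err := os.Stat(filepath.Join(dir, name))
		if err != nil || !info.Mode().IsRegular() {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Assumed layout: everything sits in the current directory.
	dir := "."
	want := []string{"librwkv.dylib", "small.bin", "20B_tokenizer.json"}

	if missing := missingFiles(dir, want); len(missing) > 0 {
		fmt.Println("missing from bundle:", missing)
		return
	}
	fmt.Println("bundle is complete")
}
```

If the library is shipped next to the executable like this, DYLD_LIBRARY_PATH still has to point at that directory before the program starts.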

License

This program is licensed under the MIT license. See LICENSE for details.

As far as I am aware, the Raven models are also open source.

Documentation

Functions

func BPETokenizeWithMerges

func BPETokenizeWithMerges(tokenizer Tokenizer, text string) []string

func ByteLevelPreTokenize

func ByteLevelPreTokenize(input string, addPrefixSpace bool) string

func DeTokenise

func DeTokenise(tk Tokenizer, tokens []int) string

func GetSystemInfoString

func GetSystemInfoString() string

func QuantizeModelFile

func QuantizeModelFile(modelFilePathIn, modelFilePathOut string, formatName string) (bool, error)

Types

type AddedToken

type AddedToken struct {
	Id         int    `json:"id"`
	Special    bool   `json:"special"`
	Content    string `json:"content"`
	SingleWord bool   `json:"single_word"`
	Lstrip     bool   `json:"lstrip"`
	Rstrip     bool   `json:"rstrip"`
	Normalized bool   `json:"normalized"`
}

type Context

type Context struct {
	// contains filtered or unexported fields
}

func InitFromFile

func InitFromFile(modelFilePath string, nThreads uint32) (*Context, error)

func (*Context) Eval

func (ctx *Context) Eval(token int32, stateIn []float32) ([]float32, []float32, bool, error)

func (*Context) Free

func (ctx *Context) Free()

func (*Context) GetLogitsBufferElementCount

func (ctx *Context) GetLogitsBufferElementCount() uint32

func (*Context) GetStateBufferElementCount

func (ctx *Context) GetStateBufferElementCount() uint32

type Decoder

type Decoder struct {
	Type           string `json:"type"`
	AddPrefixSpace bool   `json:"add_prefix_space"`
	TrimOffsets    bool   `json:"trim_offsets"`
}

type Model

type Model struct {
	Type                    string         `json:"type"`
	Dropout                 float32        `json:"dropout"`
	UnkToken                string         `json:"unk_token"`
	ContinuingSubwordPrefix string         `json:"continuing_subword_prefix"`
	EndOfWordSuffix         string         `json:"end_of_word_suffix"`
	FuseUnk                 bool           `json:"fuse_unk"`
	Vocab                   map[string]int `json:"vocab"`
	Merges                  []string       `json:"merges"`
}

type Normalizer

type Normalizer struct {
	Type string `json:"type"`
}

type PostProcessor

type PostProcessor struct {
	Type           string `json:"type"`
	AddPrefixSpace bool   `json:"add_prefix_space"`
	TrimOffsets    bool   `json:"trim_offsets"`
}

type PreTokenizer

type PreTokenizer struct {
	Type           string `json:"type"`
	AddPrefixSpace bool   `json:"add_prefix_space"`
	TrimOffsets    bool   `json:"trim_offsets"`
}

type RwkvState

type RwkvState struct {
	// The context
	Context   *Context
	State     []float32
	Logits    []float32
	Tokenizer *Tokenizer
}

func LoadFiles

func LoadFiles(modelFile, tokenFile string, threads uint32) *RwkvState

LoadFiles loads the model and tokenizer from the given files. modelFile is the path to the model file. This must be in ggml format. See the aimodels/ directory for examples. tokenFile is the path to the tokenizer file. This must be in json format. At the moment, only the 20B_tokenizer.json file from rwkv.cpp is supported.

func (*RwkvState) GenerateResponse

func (r *RwkvState) GenerateResponse(maxTokens int, stopString string, temperature float32, top_p float32, tokenCallback func(s string) bool) string

GenerateResponse generates a response from the current state. The state is changed by this function in the process of generating the response. maxTokens is the maximum number of tokens to generate. stopString is a string to stop at: if the response contains this string, the response is truncated at that point.
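The interaction between maxTokens and stopString can be sketched without a model. The code below is not the package's implementation: `truncateAtStop` and `generate` are hypothetical names, the tokens are canned strings, and it assumes the stop string itself is dropped from the result.

```go
package main

import (
	"fmt"
	"strings"
)

// truncateAtStop cuts response at the first occurrence of stop and reports
// whether it was found. Assumption: the stop string itself is dropped, and
// an empty stop string disables truncation.
func truncateAtStop(response, stop string) (string, bool) {
	if stop == "" {
		return response, false
	}
	if i := strings.Index(response, stop); i >= 0 {
		return response[:i], true
	}
	return response, false
}

// generate appends tokens from next until maxTokens is reached or the
// accumulated text contains stop.
func generate(next func() string, maxTokens int, stop string) string {
	var sb strings.Builder
	for i := 0; i < maxTokens; i++ {
		sb.WriteString(next())
		if out, hit := truncateAtStop(sb.String(), stop); hit {
			return out
		}
	}
	return sb.String()
}

func main() {
	// Canned tokens stand in for model output.
	tokens := []string{"Hel", "lo", " there", "\n", "Bob:", " hi"}
	i := 0
	next := func() string {
		t := tokens[i%len(tokens)]
		i++
		return t
	}
	fmt.Printf("%q\n", generate(next, 100, "\n")) // prints "Hello there"
}
```

Checking the accumulated text rather than each token matters because a stop string may be split across two tokens.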

func (*RwkvState) PredictNextToken

func (r *RwkvState) PredictNextToken(temperature float32, top_p float32) string

Predict the next token from the current state. State will not be changed by this function.
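The temperature and top_p parameters are not described further here. As a rough guide, the sketch below shows the common meaning of those two knobs (temperature-scaled softmax followed by nucleus sampling). It is an assumption that this package follows the same scheme; `softmax`, `nucleus` and `sample` are illustrative names, not part of the API.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// softmax converts logits to probabilities after dividing by temperature.
// Lower temperatures sharpen the distribution; temperature must be > 0.
func softmax(logits []float32, temperature float32) []float64 {
	probs := make([]float64, len(logits))
	maxLogit := math.Inf(-1)
	for _, l := range logits {
		maxLogit = math.Max(maxLogit, float64(l))
	}
	sum := 0.0
	for i, l := range logits {
		probs[i] = math.Exp((float64(l) - maxLogit) / float64(temperature))
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}
	return probs
}

// nucleus returns the indices of the most probable tokens whose cumulative
// probability first reaches topP, most probable first.
func nucleus(probs []float64, topP float64) []int {
	idx := make([]int, len(probs))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool { return probs[idx[a]] > probs[idx[b]] })
	cum := 0.0
	for n, i := range idx {
		cum += probs[i]
		if cum >= topP {
			return idx[:n+1]
		}
	}
	return idx
}

// sample draws one index from keep, weighted by its probability.
func sample(probs []float64, keep []int, rng *rand.Rand) int {
	total := 0.0
	for _, i := range keep {
		total += probs[i]
	}
	r := rng.Float64() * total
	for _, i := range keep {
		r -= probs[i]
		if r <= 0 {
			return i
		}
	}
	return keep[len(keep)-1]
}

func main() {
	logits := []float32{2.0, 1.0, 0.1, -1.0} // made-up logits for four tokens
	probs := softmax(logits, 0.8)
	keep := nucleus(probs, 0.9)
	rng := rand.New(rand.NewSource(1))
	fmt.Println("candidates:", keep)
	fmt.Println("sampled index:", sample(probs, keep, rng))
}
```

With these made-up logits, only the two most likely tokens survive the top_p cut, so the two unlikely ones can never be sampled.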

func (*RwkvState) ProcessInput

func (r *RwkvState) ProcessInput(input string) error

ProcessInput processes the input string, updating the state of the model.

func (*RwkvState) Reset

func (r *RwkvState) Reset()

Reset the state of the model. This is useful if you want to start a new conversation. After resetting, you can't generate a response until you process some input.

type Token

type Token struct {
	ID    int
	Value string
	Start int
	End   int
}

func ByteLevelDecode

func ByteLevelDecode(in []string, tk Tokenizer) []Token

func Tokenize

func Tokenize(input string, pipelineConfig Tokenizer) ([]Token, error)

type Tokenizer

type Tokenizer struct {
	AddedTokens   []AddedToken  `json:"added_tokens"`
	Normalizer    Normalizer    `json:"normalizer"`
	PreTokenizer  PreTokenizer  `json:"pre_tokenizer"`
	PostProcessor PostProcessor `json:"post_processor"`
	Decoder       Decoder       `json:"decoder"`
	Model         Model         `json:"model"`
}

func LoadTokeniser

func LoadTokeniser(file string) (Tokenizer, error)
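The json tags on the types above describe how a tokenizer file maps onto Go structs. The sketch below decodes a tiny document of the same shape with encoding/json. The lowercase types are trimmed local copies made for illustration, `parseTokenizer` is a hypothetical helper rather than LoadTokeniser itself, and the JSON is made up, not taken from 20B_tokenizer.json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed local copies of the documented types, keeping only a few fields.
type model struct {
	Type   string         `json:"type"`
	Vocab  map[string]int `json:"vocab"`
	Merges []string       `json:"merges"`
}

type addedToken struct {
	Id      int    `json:"id"`
	Special bool   `json:"special"`
	Content string `json:"content"`
}

type tokenizer struct {
	AddedTokens []addedToken `json:"added_tokens"`
	Model       model        `json:"model"`
}

// parseTokenizer decodes a tokenizer description from JSON bytes.
// Fields not present in the structs are ignored by encoding/json.
func parseTokenizer(data []byte) (tokenizer, error) {
	var tk tokenizer
	err := json.Unmarshal(data, &tk)
	return tk, err
}

// sampleJSON is a made-up document in the same shape as a tokenizer file.
const sampleJSON = `{
  "added_tokens": [{"id": 0, "special": true, "content": "<|endoftext|>"}],
  "model": {
    "type": "BPE",
    "vocab": {"h": 1, "i": 2, "hi": 3},
    "merges": ["h i"]
  }
}`

func main() {
	tk, err := parseTokenizer([]byte(sampleJSON))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(tk.Model.Type, len(tk.Model.Vocab), tk.Model.Merges) // BPE 3 [h i]
}
```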

func (Tokenizer) Encode

func (t Tokenizer) Encode(text string) ([]Token, error)
