Showing posts with label arm64. Show all posts

Monday, May 17, 2021

How to compile Quantlib-Python for Raspberry Pi 4B arm32 and arm64

The Raspberry Pi OS buster image (32-bit or 64-bit) ships with gcc 8 and Python 3.7 by default. Compiling QuantLib-Python on this machine can fail with an out-of-memory error, and cross-compiling in Docker may use a different Python version that is not compatible. The trick to compiling on the Raspberry Pi itself is to set up swap (say 2 GB on a board with 4 GB of RAM) and to turn off the debug flag -g when compiling the Python package.
Shell script for building arm32 version
# Install the packages needed for building
sudo apt update
sudo apt install -y build-essential wget libbz2-dev libboost-test1.67.0 libboost-test-dev

# Get QuantLib-1.22 and build the static library
cd ${HOME}
wget https://github.com/lballabio/QuantLib/releases/download/QuantLib-v1.22/QuantLib-1.22.tar.gz
tar xzf QuantLib-1.22.tar.gz
cd QuantLib-1.22/
./configure --prefix=/usr --disable-shared CXXFLAGS=-O3
make -j 4 && sudo make install
sudo ldconfig

# Set up and enable swap, then check that at least 2 GB is available
sudo dphys-swapfile setup
sudo dphys-swapfile swapon
free -mh

sudo apt install -y python3 python3-pip python-dev libgomp1

# Get QuantLib-SWIG-1.22 and compile it
cd ${HOME}
wget --no-check-certificate https://github.com/lballabio/QuantLib-SWIG/releases/download/QuantLib-SWIG-v1.22/QuantLib-SWIG-1.22.tar.gz
tar xfz QuantLib-SWIG-1.22.tar.gz
cd QuantLib-SWIG-1.22/
./configure CXXFLAGS="-O2 --param ggc-min-expand=1 --param ggc-min-heapsize=32768 -Wno-deprecated-declarations -Wno-misleading-indentation" PYTHON=/usr/bin/python3

# Compile manually and remove the -g flag
cd Python/
mkdir -p build/temp.linux-armv7l-3.7/QuantLib
export CXX="echo gcc"; python3 setup.py bdist_wheel
g++ -fwrapv -O2 -Wall -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DNDEBUG -I/usr/include/python3.7m -I/usr/include -c QuantLib/quantlib_wrap.cpp -o build/temp.linux-armv7l-3.7/QuantLib/quantlib_wrap.o -Wno-unused --param ggc-min-expand=1 --param ggc-min-heapsize=32768 -Wno-deprecated-declarations -Wno-misleading-indentation
mkdir -p build/lib.linux-armv7l-3.7/QuantLib/
g++ -shared -Wl,-z,relro -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-armv7l-3.7/QuantLib/quantlib_wrap.o -lQuantLib -o build/lib.linux-armv7l-3.7/QuantLib/_QuantLib.cpython-37m-arm-linux-gnueabihf.so

# Create the wheel file
python3 setup.py bdist_wheel

# Upgrade pip and install the wheel file
/usr/bin/python3 -m pip install --upgrade pip
pip3 install dist/QuantLib-1.22-cp37-cp37m-linux_armv7l.whl

# Or alternatively install as a site-package
sudo python3 setup.py install

# Test the examples after installation
pip3 install pandas
python3 examples/bonds.py
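The script above only prints the swap size with free and leaves the 2 GB check to the reader. A minimal sketch of an automated check follows; the helper names swap_total_mb and has_enough_swap are made up for illustration, and the only assumption is the usual "SwapTotal: <n> kB" line in /proc/meminfo.

```shell
# Hypothetical helpers (not part of the original script).
# Print total swap in MB from a meminfo-style file (default: /proc/meminfo).
swap_total_mb() {
    # The SwapTotal line is reported in kB
    awk '/^SwapTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed when total swap is at least $1 MB; $2 is an optional meminfo path.
has_enough_swap() {
    [ "$(swap_total_mb "${2:-/proc/meminfo}")" -ge "$1" ]
}
```

For example, `has_enough_swap 2000 || echo "increase swap first"` could be run before the g++ step. On Raspberry Pi OS the size created by dphys-swapfile is taken from CONF_SWAPSIZE in /etc/dphys-swapfile, so that value may need raising before running `sudo dphys-swapfile setup`.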


Compiling for Raspberry Pi arm64 is very similar, but the -fPIC flag has to be added when building the QuantLib static library.
Shell script for building arm64 version
# Install the packages needed for building
sudo apt update
sudo apt install -y build-essential wget libbz2-dev
sudo apt install -y libboost-test1.67.0 libboost-test-dev

cd ${HOME}
wget https://github.com/lballabio/QuantLib/releases/download/QuantLib-v1.22/QuantLib-1.22.tar.gz
tar xzf QuantLib-1.22.tar.gz
cd QuantLib-1.22/

# Enable the -fPIC flag for building the static library
./configure --prefix=/usr --disable-shared CXXFLAGS="-O3 -fPIC"
make -j 4 && sudo make install
sudo ldconfig

# If the Raspberry Pi has 8 GB of RAM, there is no need to set up and enable swap
sudo apt install -y python3 python3-pip python-dev libgomp1

# Get QuantLib-SWIG-1.22 and compile it
cd ${HOME}
wget https://github.com/lballabio/QuantLib-SWIG/releases/download/QuantLib-SWIG-v1.22/QuantLib-SWIG-1.22.tar.gz
tar xzf QuantLib-SWIG-1.22.tar.gz
cd QuantLib-SWIG-1.22/
./configure CXXFLAGS="--param ggc-min-expand=1 --param ggc-min-heapsize=32768 -fPIC -Wno-deprecated-declarations -Wno-misleading-indentation" PYTHON=/usr/bin/python3

# Compile manually and remove the -g flag
cd Python/
mkdir -p build/temp.linux-aarch64-3.7/QuantLib/
g++ -fwrapv -O2 -Wall -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.7m -I/usr/include -c QuantLib/quantlib_wrap.cpp -o build/temp.linux-aarch64-3.7/QuantLib/quantlib_wrap.o -Wno-unused --param ggc-min-expand=1 --param ggc-min-heapsize=32768 -fno-strict-aliasing -Wno-unused -Wno-uninitialized -Wno-sign-compare -Wno-write-strings -Wno-deprecated-declarations -Wno-misleading-indentation
mkdir -p build/lib.linux-aarch64-3.7/QuantLib/
g++ -shared -Wl,-z,relro -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-aarch64-3.7/QuantLib/quantlib_wrap.o -lQuantLib -o build/lib.linux-aarch64-3.7/QuantLib/_QuantLib.cpython-37m-aarch64-linux-gnu.so

# Create the wheel file
python3 setup.py bdist_wheel

# Upgrade pip and install the wheel file
/usr/bin/python3 -m pip install --upgrade pip
pip3 install dist/QuantLib-1.22-cp37-cp37m-linux_aarch64.whl

# Or alternatively install as a site-package
sudo python3 setup.py install

# Test the examples after installation
pip3 install pandas
python3 examples/bonds.py
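The two build scripts differ mainly in the QuantLib CXXFLAGS (-fPIC on arm64) and in the architecture suffix of the build directories and wheel name. The sketch below captures that difference in two small helpers; quantlib_cxxflags and wheel_name are hypothetical names, and the cp37 tag assumes the Python 3.7 of the buster image.

```shell
# Hypothetical helpers summarising how the arm32 and arm64 scripts differ.
# Print the CXXFLAGS used for the QuantLib static library on a given machine.
quantlib_cxxflags() {
    # $1: machine name as printed by `uname -m`
    case "$1" in
        armv7l)  echo "-O3" ;;
        aarch64) echo "-O3 -fPIC" ;;
        *)       echo "unsupported machine: $1" >&2; return 1 ;;
    esac
}

# Print the wheel file name produced under dist/.
wheel_name() {
    # $1: QuantLib version, $2: machine name (armv7l or aarch64)
    echo "QuantLib-$1-cp37-cp37m-linux_$2.whl"
}
```

With these, a single script could call `./configure --prefix=/usr --disable-shared CXXFLAGS="$(quantlib_cxxflags "$(uname -m)")"` and later `pip3 install "dist/$(wheel_name 1.22 "$(uname -m)")"`.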


File Download QuantLib-1.22-cp37-cp37m-linux_armv7l.whl https://mega.nz/file/mtJSxZTT#fzDDHw0AIqz-2LIspBGNZLoyW4_MT9qjft_b-ITTA8w

File Download QuantLib-1.22-cp37-cp37m-linux_aarch64.whl https://mega.nz/file/WlAEXJCZ#UKFnlTrfQfRNzFW-OJbXHLFIHwzCw_189HvMa_xU4Oo

Thursday, April 15, 2021

HelloWorld Assembler Code for x86_64, arm64 and for linux or macOS

(1) Following the previous post, this post demonstrates the assembler code of a command-line HelloWorld program for x86_64 and arm64, on Linux or macOS.
HelloWorld.S
//
// Assembler program to print "Hello World!"
// to stdout. For arm64, x86_64, linux and macOS
//
#define STDIN  0                // standard input device
#define STDOUT 1                // standard output device

#ifdef __APPLE__
#define SYS_read  0x2000003     // system call to read input macOS
#define SYS_write 0x2000004     // system call to write message macOS
#define SYS_exit  0x2000001     // system call to terminate program macOS
#define SVC_write 4             // SVC write arm64 macOS
#define SVC_exit  1             // SVC exit arm64 macOS
#endif

#ifdef __linux__
#define SYS_read  0             // system call to read input
#define SYS_write 1             // system call to write message
#define SYS_exit  60            // system call to terminate program
#define SVC_write 64            // SVC write arm64 linux
#define SVC_exit  93            // SVC exit arm64 linux
#endif

#define EXIT_OK 0               // OK exit status

        .globl _start           // Provide program starting address to linker
#ifdef __APPLE__
        .align 4
#endif
        .text
_start:
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        mov     X0, #STDOUT             // 1 = StdOut
#ifdef __linux__
        ldr     X1, =helloworld         // string to print
        mov     X8, #SVC_write          // linux write system call
#endif
#ifdef __APPLE__
        // adr  X1, helloworld          // string to print
        // (adr calculates an address from the PC plus an offset, but for local)
        adrp    X1, helloworld@PAGE     // adrp can be used to access relative address of 4GB range
        add     X1, X1, helloworld@PAGEOFF // string to print
        mov     X16, #SVC_write         // macOS write system call
#endif
        ldr     X2, =len                // length of our string
        svc     #0                      // Call the kernel to output the string

        // Setup the parameters to exit the program
        // and then call the kernel to do it.
        mov     X0, #0                  // Use 0 return code
#ifdef __linux__
        mov     X8, #SVC_exit           // Service command code 93 terminates this program
#endif
#ifdef __APPLE__
        mov     X16, #1                 // Service command terminates this program
#endif
        svc     #0                      // Call the kernel to terminate the program
#endif

#if defined __x86_64__
        movq    $STDOUT, %rdi
#ifdef __linux__
        movq    $helloworld, %rsi       // char *
#endif
#ifdef __APPLE__
        leaq    helloworld(%rip), %rsi
#endif
        movq    $len, %rdx              // length of our string
        movq    $SYS_write, %rax        // write system call
        syscall
        movq    $EXIT_OK, %rdi          // Use 0 return code
        movq    $SYS_exit, %rax         // exit system call
        syscall
#endif

        .data
helloworld: .ascii "Hello World!\n"
len = . - helloworld                    // len = end - start


(2) To compile and debug for different systems
shell scripts
# To download the above code using command line.
curl -L https://tinyurl.com/helloworld-gas | grep -A200 START_OF_HELLOWORLD.S | sed '1d' | sed -n "/END_OF_HELLOWORLD.S/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > HelloWorld.S

# To compile with debug symbols under linux, e.g. Win10 WSL2 or Linux or Android Termux App
clang -g -c HelloWorld.S -o HelloWorld.o ; ld HelloWorld.o -o HelloWorld

# To compile under macOS (e.g. with M1 cpu)
clang -g HelloWorld.S -o HelloWorld_x86_64 -e _start -arch x86_64
clang -g HelloWorld.S -o HelloWorld_arm64 -e _start -arch arm64
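The Linux and macOS build commands differ in two ways: Linux assembles and then links with ld, while macOS lets clang link and needs `-e _start` plus an `-arch` option. A minimal sketch that picks the right command from the `uname -s` output is shown here; hello_build_cmd is a hypothetical helper that only prints the command taken from the script above.

```shell
# Hypothetical helper: print the HelloWorld build command for a platform.
hello_build_cmd() {
    # $1: kernel name from `uname -s`
    # $2: target architecture on macOS (x86_64 or arm64)
    case "$1" in
        Linux)
            echo "clang -g -c HelloWorld.S -o HelloWorld.o && ld HelloWorld.o -o HelloWorld" ;;
        Darwin)
            echo "clang -g HelloWorld.S -o HelloWorld_${2:-} -e _start -arch ${2:-}" ;;
        *)
            echo "unsupported OS: $1" >&2; return 1 ;;
    esac
}
```

One way to use it is `eval "$(hello_build_cmd "$(uname -s)" "$(uname -m)")"`, which on an M1 Mac would build the arm64 variant.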


(3) To debug using lldb
shell scripts
# To start program debug
lldb HelloWorld_x86_64
# or
lldb HelloWorld_arm64

# lldb debug session for arm64 - useful commands
(lldb) breakpoint set --name _start
(lldb) breakpoint list
(lldb) run
(lldb) step
(lldb) reg read x0 x1 x2 x8 lr pc
(lldb) reg read -f t cpsr

# lldb debug session for x86_64 - useful commands
(lldb) reg read -f d rax rdi rsi rdx rflags
(lldb) reg read -f t rflags

# print the address value in the stackpointer for x86_64
(lldb) p *(int **)$sp

# hint: to search lldb command history use ctrl-r


(4) Summary of differences
4.1) To have clang preprocess the assembler file, the filename extension should be a capital letter S on Linux. On macOS, global asm labels that are called from C should be prefixed with an underscore.
4.2) A64 (arm64) parameter/result registers are X0-X7. If the function has a return value, it is stored in X0.
4.3) x86_64 parameter registers for integers or pointers are %rdi, %rsi, %rdx, %rcx, %r8, %r9. If the function has a return value, it is stored in %rax.
4.4) Linux and macOS have different syscall numbers (x86_64) or service call numbers (arm64). They are defined in this source code.
4.5) Absolute addressing is not allowed for arm64 on macOS. There, the adr instruction can be used for accessing read-only local data, but for a non-local data section (which is a buffer in RAM), the adrp instruction with the @PAGE and @PAGEOFF operators should be used, as demonstrated in the code.
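Point 4.4 can be summarised as a lookup table. The sketch below simply restates the #define values from HelloWorld.S; syscall_number is a hypothetical helper name, and the numbers are the ones used in the source above rather than an exhaustive list.

```shell
# Hypothetical lookup of the write/exit call numbers defined in HelloWorld.S.
syscall_number() {
    # $1: os (linux|macos), $2: arch (x86_64|arm64), $3: call (write|exit)
    case "$1/$2/$3" in
        linux/x86_64/write) echo 1 ;;          # SYS_write, placed in %rax
        linux/x86_64/exit)  echo 60 ;;         # SYS_exit, placed in %rax
        linux/arm64/write)  echo 64 ;;         # SVC_write, placed in X8
        linux/arm64/exit)   echo 93 ;;         # SVC_exit, placed in X8
        macos/x86_64/write) echo 0x2000004 ;;  # SYS_write, placed in %rax
        macos/x86_64/exit)  echo 0x2000001 ;;  # SYS_exit, placed in %rax
        macos/arm64/write)  echo 4 ;;          # SVC_write, placed in X16
        macos/arm64/exit)   echo 1 ;;          # SVC_exit, placed in X16
        *) echo "unknown combination: $1/$2/$3" >&2; return 1 ;;
    esac
}
```

Note that besides the number itself, the register that carries it also changes on arm64: X8 on Linux and X16 on macOS, as in the source code.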


Tuesday, April 13, 2021

Mixing C and Assembler for x86_64 and arm64, major differences.

(1) These examples demonstrate mixing C and assembly language for x86_64 and arm64, and show the differences between the Linux and macOS environments.
callsum.c
/*
 * callsum.c
 *
 * Illustrates how to call the sum function in assembly language.
 */
#include <stdio.h>

double sum(double[], unsigned);

int main() {
    double test[] = { 40.5, 26.7, 21.9, 1.5, -40.5, -23.4 };

    printf("%20.7f\n", sum(test, 6));
    printf("%20.7f\n", sum(test, 2));
    printf("%20.7f\n", sum(test, 0));
    printf("%20.7f\n", sum(test, 3));

    printf("I am ");
#ifdef __ARM_ARCH_ISA_A64
    printf(" __ARM_ARCH_ISA_A64 ");
#endif
#ifdef __arm64__
    printf(" __arm64__ ");
#endif
#ifdef __x86_64__
    printf(" __x86_64__ ");
#endif
#ifdef __linux__
    printf(" __linux__ ");
#endif
#ifdef __APPLE__
    printf(" __APPLE__ ");
#endif
    printf("\n");
    return 0;
}


sum.S
# -----------------------------------------------------------------------
# A 64-bit function that returns the sum of the elements in a
# floating-point array. The function has prototype:
#
#     double sum(double[] array, unsigned length)
# -----------------------------------------------------------------------
#ifdef __linux__
        .global sum
#endif
#ifdef __APPLE__
        .global _sum
#endif
        .text
#ifdef __ARM_ARCH_ISA_A64
        .align 4
#endif
#ifdef __linux__
sum:
#endif
#ifdef __APPLE__
_sum:
#endif
#ifdef __x86_64__
        xorpd   %xmm0, %xmm0            // initialize the sum to 0
        cmp     $0, %rsi                // special case for length = 0
        je      done
#endif
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        movi    d0, #0                  // initialize the sum to 0
                                        // floats in s0-7 and doubles in the d0-7 registers.
        cmp     x1, #0                  // special case for length = 0
        b.eq    done
#endif
next:
#ifdef __x86_64__
        addsd   (%rdi), %xmm0           // add in the current array element
        add     $8, %rdi                // move to next array element
        dec     %rsi                    // count down
        jnz     next                    // if not done counting, continue
#endif
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        ldr     d16, [x0]               // load the double into d16
                                        // floats in s0-7 and doubles in the d0-7 registers.
        fadd    d0, d0, d16             // add in the current array element
        add     x0, x0, #8              // move to next array element
        subs    x1, x1, #1              // count down
        cbnz    w1, next                // if not done counting, continue
#endif
done:
        ret


callfactorial.c
/*
 * An application that illustrates calling the factorial function defined elsewhere.
 */
#include <stdio.h>
#include <inttypes.h>

#ifdef __USE_C_FUNCTION
uint64_t factorial(unsigned n) {
    return (n <= 1) ? 1 : n * factorial(n-1);
}
#else
uint64_t factorial(unsigned n);
#endif

int main() {
    for (unsigned i = 0; i < 20; i++) {
#ifdef __linux__
        printf("factorial(%2u) = %lu\n", i, factorial(i));
#endif
#ifdef __APPLE__
        printf("factorial(%2u) = %llu\n", i, factorial(i));
#endif
    }

    printf("I am ");
#ifdef __ARM_ARCH_ISA_A64
    printf(" __ARM_ARCH_ISA_A64 ");
#endif
#ifdef __arm64__
    printf(" __arm64__ ");
#endif
#ifdef __x86_64__
    printf(" __x86_64__ ");
#endif
#ifdef __linux__
    printf(" __linux__ ");
#endif
#ifdef __APPLE__
    printf(" __APPLE__ ");
#endif
    printf("\n");
}


factorial.S
# ----------------------------------------------------------------------------
# A 64-bit recursive implementation of the function
#
#     uint64_t factorial(unsigned n)
#
# implemented recursively
# ----------------------------------------------------------------------------
#ifdef __linux__
        .globl  factorial
#endif
#ifdef __APPLE__
        .globl  _factorial
#endif
        .text
#ifdef __ARM_ARCH_ISA_A64
        .align 4
#endif
#ifdef __linux__
factorial:
#endif
#ifdef __APPLE__
_factorial:
#endif
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        cmp     w0, #1                  // n > 1? (the first argument arrives in w0)
        b.hi    L1                      // if yes, go do a recursive call
        mov     x0, #1                  // otherwise return 1
        ret
#endif
#ifdef __x86_64__
        cmp     $1, %rdi                # n <= 1?
        jnbe    L1                      # if not, go do a recursive call
        mov     $1, %rax                # otherwise return 1
        ret
#endif
L1:
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        mov     w0, w0                  // zero-extend n into x0
        stp     x0, lr, [sp, #-16]!     // push n and LR (x30); LR is used to return from subroutine
        sub     x0, x0, #1              // n-1
#ifdef __linux__
        bl      factorial               // factorial(n-1), result goes in x0
#endif
#ifdef __APPLE__
        bl      _factorial              // factorial(n-1), result goes in x0
#endif
        ldp     x1, lr, [sp], #16       // pop n into x1 and restore LR (x30)
        mul     x0, x0, x1              // n * factorial(n-1), stored in x0
        ret
#endif
#ifdef __x86_64__
        push    %rdi                    # save n on stack (also aligns %rsp!)
        dec     %rdi                    # n-1
#ifdef __linux__
        call    factorial               # factorial(n-1), result goes in %rax
#endif
#ifdef __APPLE__
        call    _factorial              # factorial(n-1), result goes in %rax
#endif
        pop     %rdi                    # restore n
        imul    %rdi, %rax              # n * factorial(n-1), stored in %rax
        ret
#endif


callmaxofthree.c
/*
 * callmaxofthree.c
 *
 * A small program that illustrates how to call the maxofthree function we wrote in
 * assembly language.
 */
#include <stdio.h>
#include <inttypes.h>

int64_t maxofthree(int64_t, int64_t, int64_t);

int main() {
#ifdef __linux__
    printf("%ld\n", maxofthree(1, -4, -7));
    printf("%ld\n", maxofthree(2, -6, 1));
    printf("%ld\n", maxofthree(2, 3, 1));
    printf("%ld\n", maxofthree(-2, 4, 3));
    printf("%ld\n", maxofthree(2, -6, 5));
    printf("%ld\n", maxofthree(2, 4, 6));
#endif
#ifdef __APPLE__
    printf("%lld\n", maxofthree(1, -4, -7));
    printf("%lld\n", maxofthree(2, -6, 1));
    printf("%lld\n", maxofthree(2, 3, 1));
    printf("%lld\n", maxofthree(-2, 4, 3));
    printf("%lld\n", maxofthree(2, -6, 5));
    printf("%lld\n", maxofthree(2, 4, 6));
#endif

    printf("I am ");
#ifdef __ARM_ARCH_ISA_A64
    printf(" __ARM_ARCH_ISA_A64 ");
#endif
#ifdef __arm64__
    printf(" __arm64__ ");
#endif
#ifdef __x86_64__
    printf(" __x86_64__ ");
#endif
#ifdef __linux__
    printf(" __linux__ ");
#endif
#ifdef __APPLE__
    printf(" __APPLE__ ");
#endif
    printf("\n");
    return 0;
}


maxofthree.S
# -----------------------------------------------------------------------------
# A 64-bit function that returns the maximum value of its three 64-bit integer
# arguments. The function has signature:
#
#     int64_t maxofthree(int64_t x, int64_t y, int64_t z)
#
# Note that the parameters for x86_64 have already been passed in rdi, rsi, and
# rdx, and the parameters for arm64 in x0, x1, and x2. We just have to return
# the value in rax (x86_64) or x0 (arm64).
# -----------------------------------------------------------------------------
#ifdef __linux__
        .globl  maxofthree
#endif
#ifdef __APPLE__
        .globl  _maxofthree
#endif
        .text
#ifdef __ARM_ARCH_ISA_A64
        .align 4
#endif
#ifdef __linux__
maxofthree:
#endif
#ifdef __APPLE__
_maxofthree:
#endif
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        cmp     x0, x1                  // is x0 > x1
        csel    x0, x0, x1, GT          // if GT, x0 = x0 else x0 = x1
        cmp     x0, x2                  // is x0 > x2
        csel    x0, x0, x2, GT          // if GT, x0 = x0 else x0 = x2
        ret                             // the max will be in x0
#endif
#ifdef __x86_64__
        mov     %rdi, %rax              # result (rax) initially holds x
        cmp     %rsi, %rax              # is x less than y?
        cmovl   %rsi, %rax              # if so, set result to y
        cmp     %rdx, %rax              # is max(x,y) less than z?
        cmovl   %rdx, %rax              # if so, set result to z
        ret                             # the max will be in rax
#endif


chaskey.h
#ifndef CHASKEY_H
#define CHASKEY_H

#define CHASKEY_ENCRYPT 1
#define CHASKEY_DECRYPT 0

#ifdef __cplusplus
extern "C" {
#endif

void chas_encrypt(int, void*, void*);
void chaskey(void*, void*);
void chas_encryptx(void*, void*);

#ifdef __cplusplus
}
#endif

#endif


testckey.c
// test unit for chaskey
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
#include "chaskey.h"

uint8_t plain[16] = {
    0xb8, 0x23, 0x28, 0x26, 0xfd, 0x5e, 0x40, 0x5e,
    0x69, 0xa3, 0x01, 0xa9, 0x78, 0xea, 0x7a, 0xd8 };

uint8_t key[16] = {
    0x56, 0x09, 0xe9, 0x68, 0x5f, 0x58, 0xe3, 0x29,
    0x40, 0xec, 0xec, 0x98, 0xc5, 0x22, 0x98, 0x2f };

uint8_t cipher[16] = {
    0xd5, 0x60, 0x8d, 0x4d, 0xa2, 0xbf, 0x34, 0x7b,
    0xab, 0xf8, 0x77, 0x2f, 0xdf, 0xed, 0xde, 0x07 };

int main(void) {
    uint8_t t[16];
    int e;

    memcpy(t, plain, 16);
    chaskey(key, t);
    e = memcmp(t, cipher, 16)==0;
    printf("\nCHASKEY Encryption: %s\n", e ? "OK" : "FAILED");

    printf("I am ");
#ifdef __ARM_ARCH_ISA_A64
    printf(" __ARM_ARCH_ISA_A64 ");
#endif
#ifdef __arm64__
    printf(" __arm64__ ");
#endif
#ifdef __x86_64__
    printf(" __x86_64__ ");
#endif
#ifdef __linux__
    printf(" __linux__ ");
#endif
#ifdef __APPLE__
    printf(" __APPLE__ ");
#endif
    printf("\n");
    return 0;
}


ckey.S
// CHASKEY in ARM64 assembly
// Chaskey-LTS Block Cipher in AMD64 assembly (Encryption only)
        .text
#ifdef __x86_64__
        .intel_syntax noprefix
#endif
        .globl chaskey
        .globl _chaskey
#ifdef __ARM_ARCH_ISA_A64
        .align 4
#endif
// chaskey(void*mk, void*data);
chaskey:
_chaskey:
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        // load 128-bit key
        ldp     w2, w3, [x0]
        ldp     w4, w5, [x0, 8]
        // load 128-bit plain text
        ldp     w6, w7, [x1]
        ldp     w8, w9, [x1, 8]
        // xor plaintext with key
        eor     w6, w6, w2              // x[0] ^= k[0];
        eor     w7, w7, w3              // x[1] ^= k[1];
        eor     w8, w8, w4              // x[2] ^= k[2];
        eor     w9, w9, w5              // x[3] ^= k[3];
        mov     w10, 16                 // i = 16
#endif
#ifdef __x86_64__
        // .intel_syntax noprefix
        push    rbx
        push    rbp
        push    rsi
        # load plaintext
        lodsd
        xchg    eax, ebp
        lodsd
        xchg    eax, ebx
        lodsd
        xchg    eax, edx
        lodsd
        xchg    eax, ebp
        # pre-whiten
        xor     eax, [rdi   ]
        xor     ebx, [rdi+ 4]
        xor     edx, [rdi+ 8]
        xor     ebp, [rdi+12]
        push    16
        pop     rcx
#endif
L0:
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        add     w6, w6, w7              // x[0] += x[1];
        eor     w7, w6, w7, ror 27      // x[1]=R(x[1],27) ^ x[0];
        add     w8, w8, w9              // x[2] += x[3];
        eor     w9, w8, w9, ror 24      // x[3]=R(x[3],24) ^ x[2];
        add     w8, w8, w7              // x[2] += x[1];
        ror     w6, w6, 16
        add     w6, w9, w6              // x[0]=R(x[0],16) + x[3];
        eor     w9, w6, w9, ror 19      // x[3]=R(x[3],19) ^ x[0];
        eor     w7, w8, w7, ror 25      // x[1]=R(x[1],25) ^ x[2];
        ror     w8, w8, 16              // x[2]=R(x[2],16);
        subs    w10, w10, 1             // i--
        bne     L0                      // i > 0
        // xor cipher text with key
        eor     w6, w6, w2              // x[0] ^= k[0];
        eor     w7, w7, w3              // x[1] ^= k[1];
        eor     w8, w8, w4              // x[2] ^= k[2];
        eor     w9, w9, w5              // x[3] ^= k[3];
        // save 128-bit cipher text
        stp     w6, w7, [x1]
        stp     w8, w9, [x1, 8]
        ret
#endif
#ifdef __x86_64__
        // .intel_syntax noprefix
        # x[0] += x[1]
        add     eax, ebx
        # x[1]=ROTR32(x[1],27) ^ x[0]
        ror     ebx, 27
        xor     ebx, eax
        # x[2] += x[3]
        add     edx, ebp
        # x[3]=ROTR32(x[3],24) ^ x[2]
        ror     ebp, 24
        xor     ebp, edx
        # x[2] += x[1]
        add     edx, ebx
        # x[0]=ROTR32(x[0],16) + x[3]
        ror     eax, 16
        add     eax, ebp
        # x[3]=ROTR32(x[3],19) ^ x[0]
        ror     ebp, 19
        xor     ebp, eax
        # x[1]=ROTR32(x[1],25) ^ x[2]
        ror     ebx, 25
        xor     ebx, edx
        # x[2]=ROTR32(x[2],16)
        ror     edx, 16
        loop    L0
        # post-whiten
        xor     eax, [rdi   ]
        xor     ebx, [rdi+ 4]
        xor     edx, [rdi+ 8]
        xor     ebp, [rdi+12]
        pop     rdi
        # save ciphertext
        stosd
        xchg    eax, ebx
        stosd
        xchg    eax, edx
        stosd
        xchg    eax, ebp
        stosd
        pop     rbp
        pop     rbx
        ret
#endif


speck.h
#ifndef SPECK_H
#define SPECK_H

#ifdef __cplusplus
extern "C" {
#endif

void speck64(void*, void*);
void speck128(void*, void*);

#ifdef __cplusplus
}
#endif

#endif


testspk.c
// test unit for speck
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
#include "speck.h"

void print_bytes(char *s, void *p, int len) {
    int i;
    printf("%s : ", s);
    for (i=0; i<len; i++) {
        printf ("%02x ", ((uint8_t*)p)[i]);
    }
    putchar('\n');
}

// SPECK64/128 test vectors
//
// p = 0x3b7265747475432d
uint8_t plain64[]= {
    0x74, 0x65, 0x72, 0x3b, 0x2d, 0x43, 0x75, 0x74 };

// c = 0x8c6fa548454e028b
uint8_t cipher64[]= {
    0x48, 0xa5, 0x6f, 0x8c, 0x8b, 0x02, 0x4e, 0x45 };

// key = 0x03020100, 0x0b0a0908, 0x13121110, 0x1b1a1918
uint8_t key64[]= {
    0x00, 0x01, 0x02, 0x03, 0x08, 0x09, 0x0a, 0x0b,
    0x10, 0x11, 0x12, 0x13, 0x18, 0x19, 0x1a, 0x1b };

// SPECK128/256 test vectors
//
uint8_t key128[]= {
    0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
    0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
    0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
    0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f };

uint8_t plain128[]= {
    0x70, 0x6f, 0x6f, 0x6e, 0x65, 0x72, 0x2e, 0x20,
    0x49, 0x6e, 0x20, 0x74, 0x68, 0x6f, 0x73, 0x65};

uint64_t cipher128[2] = {0x4eeeb48d9c188f43, 0x4109010405c0f53e};

#define R(v,n)(((v)>>(n))|((v)<<(64-(n))))
#define F(n)for(i=0;i<n;i++)
typedef unsigned long long W;

void speck128x(void*mk,void*in){
    W i,t,k[4],r[2];

    memcpy(r,in,16);
    memcpy(k,mk,32);

    F(34)
        r[1]=(R(r[1],8)+*r)^*k,
        *r=R(*r,61)^r[1],
        t=k[3],
        k[3]=(R(k[1],8)+*k)^i,
        *k=R(*k,61)^k[3],
        k[1]=k[2],k[2]=t;

    memcpy(in,r,16);
}

int main (void) {
    uint64_t buf[4];
    int equ;

    // copy plain text to local buffer
    memcpy (buf, plain64, sizeof(plain64));
    speck64(key64, buf);
    equ = memcmp(cipher64, buf, sizeof(cipher64))==0;
    printf ("\nSPECK64/128 encryption %s\n", equ ? "OK" : "FAILED");
    print_bytes("CT result ", buf, sizeof(plain64));
    print_bytes("CT expected", cipher64, sizeof(cipher64));
    print_bytes("K ", key64, sizeof(key64));
    print_bytes("PT", plain64, sizeof(plain64));

    // copy plain text to local buffer
    memcpy (buf, plain128, sizeof(plain128));
#ifdef __USE_C_FUNCTION
    speck128x(key128, buf);
#else
    speck128(key128, buf);
#endif
    equ = memcmp(cipher128, buf, sizeof(cipher128))==0;
    printf ("\nSPECK128/256 encryption %s\n", equ ? "OK" : "FAILED");
    print_bytes("CT result ", buf, sizeof(plain128));
    print_bytes("CT expected", cipher128, sizeof(cipher128));
    print_bytes("K ", key128, sizeof(key128));
    print_bytes("PT", plain128, sizeof(plain128));

    printf("I am ");
#ifdef __ARM_ARCH_ISA_A64
    printf(" __ARM_ARCH_ISA_A64 ");
#endif
#ifdef __arm64__
    printf(" __arm64__ ");
#endif
#ifdef __x86_64__
    printf(" __x86_64__ ");
#endif
#ifdef __linux__
    printf(" __linux__ ");
#endif
#ifdef __APPLE__
    printf(" __APPLE__ ");
#endif
    printf("\n");
    return 0;
}


spk64.S
// SPECK64/128 in ARM64 assembly
// SPECK-64/128 Block Cipher in x86 assembly (Encryption only)
#ifdef __x86_64__
        .intel_syntax noprefix
#endif
        .text
        .globl speck64
        .globl _speck64
#ifdef __ARM_ARCH_ISA_A64
        .align 4
#endif
// speck64(void*mk, void*data);
speck64:
_speck64:
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        // load 128-bit key
        // k0 = k[0]; k1 = k[1]; k2 = k[2]; k3 = k[3];
        ldp     w5, w6, [x0]
        ldp     w7, w8, [x0, 8]
        // load 64-bit plain text
        ldp     w2, w4, [x1]            // x0 = x[0]; x1 = x[1];
        mov     w3, wzr                 // i=0
#endif
#ifdef __x86_64__
        push    rbx
        push    rbp
        push    rsi                     # save
        lodsd
        xchg    eax, ebx                # ebx = in[0]
        lodsd
        xchg    eax, edx                # edx = in[1]
        push    rdi
        pop     rsi
        lodsd
        xchg    eax, edi                # edi = key[0]
        lodsd
        xchg    eax, ebp                # ebp = key[1]
        lodsd
        xchg    eax, ecx                # ecx = key[2]
        lodsd
        xchg    eax, esi                # esi = key[3]
        xor     eax, eax                # i = 0
#endif
L0:
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        ror     w2, w2, 8
        add     w2, w2, w4              // x0 = (R(x0, 8) + x1) ^ k0;
        eor     w2, w2, w5              //
        eor     w4, w2, w4, ror 29      // x1 = R(x1, 29) ^ x0;
        mov     w9, w8                  // backup k3
        ror     w6, w6, 8
        add     w8, w5, w6              // k3 = (R(k1, 8) + k0) ^ i;
        eor     w8, w8, w3              //
        eor     w5, w8, w5, ror 29      // k0 = R(k0, 29) ^ k3;
        mov     w6, w7                  // k1 = k2;
        mov     w7, w9                  // k2 = t;
        add     w3, w3, 1               // i++;
        cmp     w3, 27                  // i < 27;
        bne     L0
        // save result
        stp     w2, w4, [x1]            // x[0] = x0; x[1] = x1;
        ret
#endif
#ifdef __x86_64__
        # ebx = (ROTR32(ebx, 8) + edx) ^ edi;
        ror     ebx, 8
        add     ebx, edx
        xor     ebx, edi
        # edx = ROTR32(edx, 29) ^ ebx;
        ror     edx, 29
        xor     edx, ebx
        # ebp = (ROTR32(ebp, 8) + edi) ^ i;
        ror     ebp, 8
        add     ebp, edi
        xor     ebp, eax
        # edi = ROTR32(edi, 29) ^ ebp;
        ror     edi, 29
        xor     edi, ebp
        xchg    esi, ecx
        xchg    esi, ebp
        # i++
        inc     al
        cmp     al, 27
        jnz     L0
        pop     rdi
        xchg    eax, ebx
        stosd
        xchg    eax, edx
        stosd
        pop     rbp
        pop     rbx
        ret
#endif


spk128.S
// SPECK128/256 in ARM64 assembly
// SPECK-128/256 Block Cipher in AMD64 assembly (Encryption only)
#ifdef __x86_64__
        .intel_syntax noprefix
#endif
        .text
        .global speck128
        .global _speck128
#ifdef __ARM_ARCH_ISA_A64
        .align 4
#endif
// speck128(void*mk, void*data);
speck128:
_speck128:
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        // load 256-bit key
        // k0 = k[0]; k1 = k[1]; k2 = k[2]; k3 = k[3];
        ldp     x5, x6, [x0]
        ldp     x7, x8, [x0, 16]
        // load 128-bit plain text
        ldp     x2, x4, [x1]            // x0 = x[0]; x1 = x[1];
        mov     x3, xzr                 // i=0
#endif
#ifdef __x86_64__
        push    rbp
        push    rbx
        push    rdi
        push    rsi
        # load 128-bit plaintext
        mov     rbp, [rsi  ]
        mov     rsi, [rsi+8]
        # load 256-bit key
        mov     rbx, [rdi   ]           # k0
        mov     rcx, [rdi+ 8]           # k1
        mov     rdx, [rdi+16]           # k2
        mov     rdi, [rdi+24]           # k3
        # i = 0
        xor     eax, eax
#endif
L0:
#if defined __arm64__ || defined __ARM_ARCH_ISA_A64
        ror     x4, x4, 8
        add     x4, x4, x2              // x1 = (R(x1, 8) + x0) ^ k0;
        eor     x4, x4, x5              //
        eor     x2, x4, x2, ror 61      // x0 = R(x0, 61) ^ x1;
        mov     x9, x8                  // backup k3
        ror     x6, x6, 8
        add     x8, x5, x6              // k3 = (R(k1, 8) + k0) ^ i;
        eor     x8, x8, x3              //
        eor     x5, x8, x5, ror 61      // k0 = R(k0, 61) ^ k3;
        mov     x6, x7                  // k1 = k2;
        mov     x7, x9                  // k2 = t;
        add     x3, x3, 1               // i++;
        cmp     x3, 34                  // i < 34;
        bne     L0
        // save result
        stp     x2, x4, [x1]            // x[0] = x0; x[1] = x1;
        ret
#endif
#ifdef __x86_64__
        # x[1] = (R(x[1], 8) + x[0]) ^ k[0];
        ror     rsi, 8
        add     rsi, rbp
        xor     rsi, rbx
        # x[0] = R(x[0], 61) ^ x[1];
        ror     rbp, 61
        xor     rbp, rsi
        # k[1] = (R(k[1], 8) + k[0]) ^ i;
        ror     rcx, 8
        add     rcx, rbx
        xor     cl, al
        # k[0] = R(k[0], 61) ^ k[3];
        ror     rbx, 61
        xor     rbx, rcx
        # X(k3, k2), X(k3, k1);
        xchg    rdi, rdx
        xchg    rdi, rcx
        # i++
        inc     al
        cmp     al, 34
        jnz     L0
        pop     rax
        push    rax
        # save 128-bit result
        mov     [rax  ], rbp
        mov     [rax+8], rsi
        pop     rsi
        pop     rdi
        pop     rbx
        pop     rbp
        ret
#endif


(2) Compile and Linking
shell script
# use clang to compile and link
# To compile and link the above in Linux (e.g. tested in Android arm64 Termux App or Windows 10 WSL2; the clang package should be installed)
clang callsum.c sum.S -o callsum;
clang callfactorial.c factorial.S -o callfactorial;
clang maxofthree.S callmaxofthree.c -o callmaxofthree;
clang testckey.c ckey.S -o testckey;
clang testspk.c spk64.S spk128.S -o testspk;

# To compile and link the above in macOS (a new M1 machine can run both x86_64 and arm64 binaries with Rosetta 2 installed).
# To compile on macOS, Xcode, the Command Line Tools and Rosetta 2 should be installed.
clang callsum.c sum.S -o callsum_x86_64 -arch x86_64;
clang callfactorial.c factorial.S -o callfactorial_x86_64 -arch x86_64;
clang maxofthree.S callmaxofthree.c -o callmaxofthree_x86_64 -arch x86_64;
clang testckey.c ckey.S -o testckey_x86_64 -arch x86_64;
clang testspk.c spk64.S spk128.S -o testspk_x86_64 -arch x86_64;

clang callsum.c sum.S -o callsum_arm64 -arch arm64;
clang callfactorial.c factorial.S -o callfactorial_arm64 -arch arm64;
clang maxofthree.S callmaxofthree.c -o callmaxofthree_arm64 -arch arm64;
clang testckey.c ckey.S -o testckey_arm64 -arch arm64;
clang testspk.c spk64.S spk128.S -o testspk_arm64 -arch arm64;

In order to debug, say using lldb, add the -g option when compiling; in addition, on macOS the binary has to be codesigned with entitlements.
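The fifteen clang commands follow one pattern, so they can be generated. The sketch below prints the debug (-g) build command for a program and, for macOS, a codesign command; sources_for, build_cmd and codesign_cmd are hypothetical helpers. The post does not say which entitlements are needed, so the plist file is an assumption: ad-hoc signing (`-s -`) with a plist that grants com.apple.security.get-task-allow is one common setup for debugging.

```shell
# Hypothetical helpers; they only print commands and do not run them.
# Map a program name to its source files, as listed in the script above.
sources_for() {
    case "$1" in
        callsum)        echo "callsum.c sum.S" ;;
        callfactorial)  echo "callfactorial.c factorial.S" ;;
        callmaxofthree) echo "maxofthree.S callmaxofthree.c" ;;
        testckey)       echo "testckey.c ckey.S" ;;
        testspk)        echo "testspk.c spk64.S spk128.S" ;;
        *)              echo "unknown program: $1" >&2; return 1 ;;
    esac
}

# Print the clang command; $2 is an optional macOS architecture (x86_64 or arm64).
build_cmd() {
    srcs=$(sources_for "$1") || return 1
    if [ -n "${2:-}" ]; then
        echo "clang -g $srcs -o ${1}_${2} -arch ${2}"
    else
        echo "clang -g $srcs -o ${1}"
    fi
}

# Print an ad-hoc codesign command; $2 is an entitlements plist (assumed, not from the post).
codesign_cmd() {
    echo "codesign -s - -f --entitlements $2 $1"
}
```

For example, `for p in callsum callfactorial callmaxofthree testckey testspk; do eval "$(build_cmd "$p" arm64)"; done` would rebuild all arm64 binaries with debug symbols on macOS.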


(3) Summary of differences
3.1) To have clang preprocess the assembler file, the filename extension should be a capital letter S on Linux. On macOS, global asm labels that are called from C should be prefixed with an underscore.
3.2) The A64 (arm64) instruction set does not include an explicit stack push instruction. Functions can use stp (store pair of registers) and ldp (load pair of registers) to carry out push and pop operations, as demonstrated in the factorial.S source code above.
3.3) Most Armv8 64-bit platforms (e.g. macOS) require quadword (16-byte) alignment of the SP register.
3.4) A64 (arm64) parameter/result registers are X0-X7. X8 is designated as the Indirect Result Location Parameter and X30 (LR) is the Link Register. If the function has a return value, it is stored in X0. A64 (arm64) floating-point results are returned in S0 or D0, as demonstrated in sum.S.
3.5) x86_64 parameter registers for integers or pointers are %rdi, %rsi, %rdx, %rcx, %r8, %r9. If the function has a return value, it is stored in %rax. x86_64 floating-point results are returned in %xmm0, as demonstrated in sum.S.
3.6) With the directive .intel_syntax noprefix, x86_64 Intel-syntax assembly code can be used. The first operand is then usually the destination operand, which is similar to the operand order of arm64 code. In addition, the % prefix can be omitted when using noprefix.
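The 16-byte rule in 3.3 is why factorial.S moves SP by 16 when it pushes a register pair, even though only 8 bytes of the pair hold the argument. A small arithmetic sketch of rounding a frame size up to the next multiple of 16 follows; align16 is a hypothetical helper name.

```shell
# Hypothetical helper: round a byte count up to the next multiple of 16.
align16() {
    echo $(( ($1 + 15) / 16 * 16 ))
}
```

So a function that needs 8 bytes of stack still reserves `align16 8` = 16 bytes on arm64, and one that needs 24 bytes reserves 32.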

(4) To download the above source code using command line
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_CALLSUM.C | sed '1d' | sed -n "/END_OF_CALLSUM.C/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > callsum.c
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_SUM.S | sed '1d' | sed -n "/END_OF_SUM.S/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > sum.S

curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_CALLFACTORIAL.C | sed '1d' | sed -n "/END_OF_CALLFACTORIAL.C/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > callfactorial.c
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_FACTORIAL.S | sed '1d' | sed -n "/END_OF_FACTORIAL.S/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > factorial.S

curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_CALLMAXOFTHREE.C | sed '1d' | sed -n "/END_OF_CALLMAXOFTHREE.C/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > callmaxofthree.c
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_MAXOFTHREE.S | sed '1d' | sed -n "/END_OF_MAXOFTHREE.S/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > maxofthree.S

curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_CHASKEY.H | sed '1d' | sed -n "/END_OF_CHASKEY.H/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > chaskey.h
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_TESTCKEY.C | sed '1d' | sed -n "/END_OF_TESTCKEY.C/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > testckey.c
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_CKEY.S | sed '1d' | sed -n "/END_OF_CKEY.S/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > ckey.S

curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_SPECK.H | sed '1d' | sed -n "/END_OF_SPECK.H/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > speck.h
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_TESTSPK.C | sed '1d' | sed -n "/END_OF_TESTSPK.C/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > testspk.c
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_SPK64.S | sed '1d' | sed -n "/END_OF_SPK64.S/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > spk64.S
curl -L https://tinyurl.com/mixcasm | grep -A200 START_OF_SPK128.S | sed '1d' | sed -n "/END_OF_SPK128.S/q;p" | sed 's/&gt;/\>/g;s/&lt;/\</g' > spk128.S
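All the commands above share one pattern: the marker is the upper-cased file name, and the text between START_OF_<marker> and END_OF_<marker> is extracted and HTML-unescaped. A minimal sketch that factors this out is below; marker_for, extract_section and download_all are hypothetical names, and the sed range is a swapped-in equivalent of the grep/sed pipeline, assuming both markers are present on the page.

```shell
# Hypothetical helpers refactoring the download commands above.
# callsum.c -> CALLSUM.C
marker_for() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Read the page on stdin and print the lines between the two markers
# for file $1, converting &gt; and &lt; back to > and <.
extract_section() {
    m=$(marker_for "$1")
    sed -n "/START_OF_$m/,/END_OF_$m/p" | sed '1d;$d' | sed 's/&gt;/>/g;s/&lt;/</g'
}

# Fetch the page once and write every source file into the current directory.
download_all() {
    page=$(mktemp)
    curl -sL https://tinyurl.com/mixcasm -o "$page" || return 1
    for f in callsum.c sum.S callfactorial.c factorial.S callmaxofthree.c \
             maxofthree.S chaskey.h testckey.c ckey.S speck.h testspk.c \
             spk64.S spk128.S; do
        extract_section "$f" < "$page" > "$f"
    done
    rm -f "$page"
}
```

Calling `download_all` in an empty directory fetches the page a single time instead of once per file.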



Saturday, December 5, 2020

How to create custom docker image for arm64

(1) When you start using docker on the arm64 architecture, e.g. on an M1, you might notice that some custom docker images are missing for arm64, so you need to build the custom image yourself.

(2) When a Docker image exists for AMD64, you can pull it and use docker history --no-trunc to view the commands that were used to build it.
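As a rough sketch of step (2), the helper below reorders `docker history` output so that it reads top to bottom like a Dockerfile. The function name `history_to_steps` and the image name in the example are hypothetical, and the `/bin/sh -c #(nop)` prefixes it strips are an assumption about how the image was built; other builders record the steps differently, so treat the result as a hint rather than a ready Dockerfile.

```shell
#!/bin/sh
# history_to_steps   (hypothetical helper)
# Read `docker history --no-trunc --format '{{.CreatedBy}}'` output on stdin
# (newest layer first) and print it oldest first in a Dockerfile-like form.
history_to_steps() {
    # Reverse the line order with awk, which is available on both Linux and macOS.
    awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }' |
        # Metadata steps carry a "#(nop)" prefix; the remaining steps are RUN commands.
        sed -e 's|^/bin/sh -c #(nop) *||' \
            -e 's|^/bin/sh -c |RUN |'
}

# Example (needs Docker and network access; the image name is a placeholder):
# docker pull --platform linux/amd64 example/amd64-image:latest
# docker history --no-trunc --format '{{.CreatedBy}}' example/amd64-image:latest | history_to_steps
```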

(3) Then create a Dockerfile and build it in your arm64 environment. It is also possible to cross-build it in an AMD64 CPU environment.
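The choice between a native build and a cross-build in step (3) can be sketched as follows. The function name `quantlib_build_cmd` is hypothetical, and the tag and Dockerfile name simply follow the example in this post. On a non-arm64 host the buildx command also assumes that arm64 emulation (QEMU) is already set up for Docker.

```shell
#!/bin/sh
# quantlib_build_cmd MACHINE   (hypothetical helper)
# Print the build command suited to a host whose `uname -m` output is MACHINE.
quantlib_build_cmd() {
    case "$1" in
        aarch64|arm64)
            # Native arm64 host (e.g. Apple M1): a plain build is enough.
            echo "docker build -f Dockerfile_ql_1.20 -t arm64v8/quantlib:1.20 ."
            ;;
        *)
            # Any other host: ask buildx for an emulated linux/arm64 build
            # and load the result into the local image store.
            echo "docker buildx build --platform linux/arm64 -f Dockerfile_ql_1.20 -t arm64v8/quantlib:1.20 --load ."
            ;;
    esac
}

# Show the command for this machine; run it by hand once it looks right.
quantlib_build_cmd "$(uname -m)"
```

Printing the command instead of running it keeps the sketch harmless on machines where Docker is not installed.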

(4) For example, the script below creates a Dockerfile that builds a QuantLib Jupyter notebook server.
P.S. Building with gcc needs plenty of RAM, preferably 4GB to 8GB.
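One way to check the memory available to Docker before starting the build is sketched below. The function name `has_enough_memory` and the default threshold of 4 GiB are assumptions taken from the note above; the byte count is expected to come from `docker info --format '{{.MemTotal}}'`.

```shell
#!/bin/sh
# has_enough_memory BYTES [MIN_GIB]   (hypothetical helper)
# Succeed when BYTES is at least MIN_GIB gibibytes (default 4).
has_enough_memory() {
    bytes=$1
    min_gib=${2:-4}
    [ "$bytes" -ge $((min_gib * 1024 * 1024 * 1024)) ]
}

# Example (needs a running Docker daemon):
# if has_enough_memory "$(docker info --format '{{.MemTotal}}')"; then
#     echo "memory looks sufficient"
# else
#     echo "consider raising the Docker memory limit before building"
# fi
```

The total reported by Docker may be slightly below the configured limit, so a lower second argument can be passed if the check is too strict.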

Shell script
cd $HOME
mkdir -p my-quantlib
cd my-quantlib

# get helloworld.ipynb
wget https://raw.githubusercontent.com/lballabio/dockerfiles/master/quantlib-jupyter/Hello%20world.ipynb

cat >$HOME/my-quantlib/Dockerfile_ql_1.20 <<'HEREEOF'
# Dockerfile_ql_1.20
# docker build -f Dockerfile_ql_1.20 -t arm64v8/quantlib:1.20 .
# docker buildx build --platform linux/arm64 -t arm64v8/quantlib:1.20 .

# Build Quantlib libraries for arm64v8
ARG tag=latest
FROM arm64v8/ubuntu:20.04
MAINTAINER Luigi Ballabio <luigi.ballabio@gmail.com>
LABEL Description="Provide a building environment where the QuantLib Python jupyter-notebook"

RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential wget libbz2-dev vim git

ENV boost_version=1.67.0
ENV boost_dir=boost_1_67_0

# Build boost
RUN echo 'Building boost ...'
#RUN wget https://dl.bintray.com/boostorg/release/${boost_version}/source/${boost_dir}.tar.gz \
RUN wget https://nchc.dl.sourceforge.net/project/boost/boost/${boost_version}/${boost_dir}.tar.gz \
    && tar xfz ${boost_dir}.tar.gz \
    && rm ${boost_dir}.tar.gz \
    && cd ${boost_dir} \
    && ./bootstrap.sh \
    && ./b2 --without-python --prefix=/usr -j 4 link=shared runtime-link=shared install \
    && cd .. && rm -rf ${boost_dir} && ldconfig

# Build Quantlib C++
RUN echo 'Building Quantlib C++ ...'
ENV quantlib_version=1.20
#RUN wget https://dl.bintray.com/quantlib/releases/QuantLib-${quantlib_version}.tar.gz \
RUN wget https://github.com/lballabio/QuantLib/releases/download/QuantLib-v1.20/QuantLib-${quantlib_version}.tar.gz \
    && tar xfz QuantLib-${quantlib_version}.tar.gz \
    && rm QuantLib-${quantlib_version}.tar.gz \
    && cd QuantLib-${quantlib_version} \
    && ./configure --prefix=/usr --disable-static CXXFLAGS=-O3 \
    && make -j 4 && make check && make install \
    && make clean \
    && cd .. && ldconfig
#    && cd .. && rm -rf QuantLib-${quantlib_version} && ldconfig

# Build Quantlib-Python
RUN echo 'Build Quantlib-Python ...'
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y swig python3 python3-pip python-dev libgomp1

# Build Quantlib for Python3
RUN echo 'Install Quantlib Python'
ENV quantlib_swig_version=1.20
#RUN wget https://dl.bintray.com/quantlib/releases/QuantLib-SWIG-${quantlib_swig_version}.tar.gz \
RUN wget https://github.com/lballabio/QuantLib-SWIG/releases/download/QuantLib-SWIG-v${quantlib_swig_version}/QuantLib-SWIG-${quantlib_swig_version}.tar.gz \
    && tar xfz QuantLib-SWIG-${quantlib_swig_version}.tar.gz \
    && rm QuantLib-SWIG-${quantlib_swig_version}.tar.gz \
    && cd QuantLib-SWIG-${quantlib_swig_version} \
    && ./configure CXXFLAGS="--param ggc-min-expand=1 --param ggc-min-heapsize=32768" PYTHON=/usr/bin/python3 \
    && make -C Python && make -C Python check && make -C Python install \
    && cd .. && rm -rf QuantLib-SWIG-${quantlib_swig_version} && ldconfig

# Build jupyter-notebook server
RUN python3 -c "print('\033[91m Building jupyter-notebook server ... \033[0m')"
RUN pip3 install --no-cache-dir jupyter jupyterlab matplotlib numpy scipy pandas ipywidgets RISE
RUN jupyter-nbextension install rise --py --sys-prefix
RUN jupyter-nbextension install widgetsnbextension --py --sys-prefix \
    && jupyter-nbextension enable widgetsnbextension --py --sys-prefix

# Build Quantlib for Python2
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y python \
 && apt-get clean
RUN wget https://bootstrap.pypa.io/pip/2.7/get-pip.py \
    && python2 get-pip.py \
    && rm get-pip.py
#RUN wget https://dl.bintray.com/quantlib/releases/QuantLib-SWIG-${quantlib_swig_version}.tar.gz \
RUN wget https://github.com/lballabio/QuantLib-SWIG/releases/download/QuantLib-SWIG-v${quantlib_swig_version}/QuantLib-SWIG-${quantlib_swig_version}.tar.gz \
    && tar xfz QuantLib-SWIG-${quantlib_swig_version}.tar.gz \
    && rm QuantLib-SWIG-${quantlib_swig_version}.tar.gz \
    && cd QuantLib-SWIG-${quantlib_swig_version} \
    && ./configure CXXFLAGS="--param ggc-min-expand=1 --param ggc-min-heapsize=32768" \
    && make -C Python && make -C Python check && make -C Python install \
    && cd .. && rm -rf QuantLib-SWIG-${quantlib_swig_version} && ldconfig
RUN pip2 install --no-cache-dir numpy

EXPOSE 8888
RUN mkdir /notebooks
VOLUME /notebooks
COPY *.ipynb /notebooks/

# Starting jupyter-notebook server
RUN python3 -c "print('\033[92m Starting jupyter-notebook server at port 8888 \033[0m')"
CMD jupyter notebook --no-browser --allow-root --ip=0.0.0.0 --port=8888 --notebook-dir=/notebooks
HEREEOF

# build image
docker build -f Dockerfile_ql_1.20 -t arm64v8/quantlib:1.20 .

# run image
docker run -d -p 8888:8888 --name myquantlibtesting arm64v8/quantlib:1.20

# list the token of the jupyter-notebook server
docker container exec -it myquantlibtesting jupyter notebook list


(5) Testing QuantLib C++ libraries and Quantlib for Python2 and Python3
Shell script
# create and start container for testing
docker run -it --rm --name myquantlib arm64v8/quantlib:1.20 /bin/bash

# Create testql.cpp
cd $HOME
cat > testql.cpp << 'testqlEOF'
#include <iostream>
#include <ql/quantlib.hpp>
int main() {
    std::cout << "BOOST version is " << BOOST_VERSION << std::endl;
    std::cout << "QL version is " << QL_VERSION << std::endl;
#if __x86_64__ || __WORDSIZE == 64
    std::cout << "This is 64 bits" << std::endl;
#elif __i386__ || __WORDSIZE == 32
    std::cout << "This is 32 bits" << std::endl;
#else
    std::cout << "This is something else" << std::endl;
#endif
    return 0;
}
testqlEOF
g++ testql.cpp -lQuantLib -o testql
./testql

# Test QuantLib C++ Examples
cd $HOME
g++ /QuantLib-*/Examples/Bonds/Bonds.cpp -lQuantLib -o testBonds
./testBonds
cd $HOME
g++ /QuantLib-*/Examples/FRA/FRA.cpp -lQuantLib -o testFRA
./testFRA

# Test python 3 QuantLib
cd $HOME
cat > $HOME/swap.py <<EOF
from __future__ import print_function
import numpy as np
import QuantLib as ql
print("QuantLib version is", ql.__version__)

# Set Evaluation Date
today = ql.Date(31,3,2015)
ql.Settings.instance().setEvaluationDate(today)

# Setup the yield termstructure
rate = ql.SimpleQuote(0.03)
rate_handle = ql.QuoteHandle(rate)
dc = ql.Actual365Fixed()
disc_curve = ql.FlatForward(today, rate_handle, dc)
disc_curve.enableExtrapolation()
hyts = ql.YieldTermStructureHandle(disc_curve)
discount = np.vectorize(hyts.discount)

start = ql.TARGET().advance(today, ql.Period('2D'))
end = ql.TARGET().advance(start, ql.Period('10Y'))
nominal = 1e7
typ = ql.VanillaSwap.Payer
fixRate = 0.03
fixedLegTenor = ql.Period('1y')
fixedLegBDC = ql.ModifiedFollowing
fixedLegDC = ql.Thirty360(ql.Thirty360.BondBasis)
index = ql.Euribor6M(ql.YieldTermStructureHandle(disc_curve))
spread = 0.0

fixedSchedule = ql.Schedule(start, end, fixedLegTenor, index.fixingCalendar(), fixedLegBDC, fixedLegBDC, ql.DateGeneration.Backward, False)
floatSchedule = ql.Schedule(start, end, index.tenor(), index.fixingCalendar(), index.businessDayConvention(), index.businessDayConvention(), ql.DateGeneration.Backward, False)

swap = ql.VanillaSwap(typ, nominal, fixedSchedule, fixRate, fixedLegDC, floatSchedule, index, spread, index.dayCounter())
engine = ql.DiscountingSwapEngine(ql.YieldTermStructureHandle(disc_curve))
swap.setPricingEngine(engine)
print(swap.NPV())
print(swap.fairRate())
EOF

# Test python3
cd $HOME
python3 swap.py

# Test python2 (cloned over https because GitHub no longer serves the unauthenticated git:// protocol)
cd $HOME
git clone https://github.com/mmport80/QuantLib-with-Python-Blog-Examples.git
cd QuantLib-with-Python-Blog-Examples/
python2 blog_frn_example.py
cd $HOME
python2 swap.py