using Brisk.implementation
this is a binary (bitstring) descriptor.
while opencv has a BoW implementation, it requires float features (kmeans is involved), e.g. from SIFT, or from AKAZE with one of its float descriptor types (DESCRIPTOR_KAZE or DESCRIPTOR_KAZE_UPRIGHT).
you cannot use binary descriptors here, as there is no meaningful "mean" of 2 bitstrings, and the L2 distance does not apply (binary descriptors are compared with the hamming distance instead).
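to make that concrete, here is a tiny sketch in plain python (no opencv needed). the two 16-bit "descriptors" are made up for illustration, real BRISK descriptors are 64 bytes long. it shows that the hamming distance is well defined, while a per-bit average is no longer a bitstring:

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of bits that differ between two equal-length bitstrings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def bits(desc: bytes) -> list:
    """Unpack a byte string into a flat list of 0/1 values (MSB first)."""
    return [(byte >> i) & 1 for byte in desc for i in range(7, -1, -1)]

# Two made-up 16-bit descriptors (illustration only).
d1 = bytes([0b11110000, 0b00001111])
d2 = bytes([0b11111111, 0b00000000])

dist = hamming(d1, d2)

# A per-bit "mean" produces 0.5 wherever the descriptors disagree,
# so the result cannot be stored as a binary descriptor again.
mean_bits = [(x + y) / 2 for x, y in zip(bits(d1), bits(d2))]
```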
tl;dr: you either have to switch descriptors, or come up with your own BoW clustering (e.g. scikit-learn has some clustering algos that work nicely with binary descriptors like orb or brisk).
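if you want to roll your own clustering, one possible approach (my assumption here, not something opencv or scikit-learn ships under this name) is "k-majority": like kmeans, but with the hamming distance for the assignment step and a bitwise majority vote in place of the mean. a minimal sketch in plain python, with descriptors stored as python ints and a deliberately naive, deterministic initialisation:

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two bitstrings stored as ints."""
    return bin(a ^ b).count("1")

def majority(descs, n_bits):
    """Bitwise majority vote: bit i is set if more than half the members set it."""
    center = 0
    for i in range(n_bits):
        ones = sum((d >> i) & 1 for d in descs)
        if 2 * ones > len(descs):
            center |= 1 << i
    return center

def k_majority(descs, k, n_bits, iters=10):
    # Naive init: evenly spaced picks from the data (an assumption, to stay deterministic).
    step = max(1, len(descs) // k)
    centers = descs[::step][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for d in descs:
            nearest = min(range(k), key=lambda j: hamming(d, centers[j]))
            clusters[nearest].append(d)
        # Recompute each center by majority vote; keep the old one if a cluster is empty.
        centers = [majority(c, n_bits) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers

# Made-up 8-bit descriptors: three near 00000000, three near 11111111.
descs = [0b00000000, 0b00000001, 0b00000010,
         0b11111111, 0b11111110, 0b11111101]
centers = k_majority(descs, k=2, n_bits=8)
```

for real data you would want a random or smarter initialisation and a packed representation (e.g. uint8 arrays), this is only meant to show the idea.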
[edit] but ok, BoW on a napkin ;)
offline: gather data. a lot. extract features from all of it and build a dictionary of the K most representative ones, usually by clustering (the cluster centers become the dictionary entries).
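a rough sketch of that offline step for float features, as a bare-bones kmeans in plain python. in opencv, BOWKMeansTrainer covers this part; the code below is only an illustration with made-up 2d "features" and a naive deterministic initialisation (both are my assumptions):

```python
def sq_dist(a, b):
    """Squared L2 distance between two float vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(features, k, iters=20):
    # Naive init: evenly spaced picks from the data (assumption, keeps it deterministic).
    step = max(1, len(features) // k)
    centers = [tuple(f) for f in features[::step][:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda j: sq_dist(f, centers[j]))
            clusters[nearest].append(f)
        # New center = mean of its members; keep the old one if a cluster is empty.
        centers = [mean(c) if c else centers[j] for j, c in enumerate(clusters)]
    return centers

# Made-up 2d features from "all training images" (real SIFT features are 128-d).
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
dictionary = kmeans(features, k=2)  # the K cluster centers are the dictionary
```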
online: get "bow" features from your image(?) and use those for classification or such, not the original data.
the simplest idea here is to collect a histogram. you compare each (surf?) feature from your img to all of the dictionary entries (the cluster centers), find the nearest one, and increase the bin / counter for that dict entry. for a dict size of 256 you get a list of 256 entries like [3,0,0,18,9,0,...] per image -- your new feature!
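the histogram step could look roughly like this in plain python. the 3-entry dictionary and the image features are made up for illustration, and the function name bow_histogram is hypothetical (in opencv, BOWImgDescriptorExtractor covers this part for float features):

```python
def sq_dist(a, b):
    """Squared L2 distance between two float vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bow_histogram(features, dictionary):
    """Count, per dictionary entry, how many features have it as nearest entry."""
    hist = [0] * len(dictionary)
    for f in features:
        nearest = min(range(len(dictionary)),
                      key=lambda j: sq_dist(f, dictionary[j]))
        hist[nearest] += 1
    return hist

# Made-up dictionary (cluster centers) and features from one image.
dictionary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
image_features = [(0.5, 0.2), (9.0, 9.5), (0.1, 0.3), (1.0, 9.0), (10.0, 10.2)]

hist = bow_histogram(image_features, dictionary)  # one fixed-length vector per image
```

since images have different numbers of keypoints, it is common to normalise the histogram (e.g. divide by its sum) before feeding it to a classifier.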
** "vlad" and "fisher vectors" are improved versions of this idea