Author: Arvid Norberg, arvid@libtorrent.org
Version: 1.1.1

Bdecoding

bdecode_node

Declared in "libtorrent/bdecode.hpp"

Sometimes it's important to get a non-owning reference to the root node (to be able to copy it as a reference, for instance). For that, use the non_owning() member function.

There are 5 different types of nodes, see type_t.

struct bdecode_node
{
   friend int bdecode (char const* start, char const* end, bdecode_node& ret
      , error_code& ec, int* error_pos, int depth_limit
      , int token_limit);
   bdecode_node ();
   bdecode_node (bdecode_node const&);
   bdecode_node& operator= (bdecode_node const&);
   type_t type () const;
   operator bool () const;
   bdecode_node non_owning () const;
   std::pair<char const*, int> data_section () const;
   bdecode_node list_at (int i) const;
   std::string list_string_value_at (int i
      , char const* default_val = "");
   boost::int64_t list_int_value_at (int i
      , boost::int64_t default_val = 0);
   int list_size () const;
   std::string dict_find_string_value (char const* key
      , char const* default_value = "") const;
   bdecode_node dict_find_string (char const* key) const;
   int dict_size () const;
   boost::int64_t dict_find_int_value (char const* key
      , boost::int64_t default_val = 0) const;
   bdecode_node dict_find (std::string key) const;
   bdecode_node dict_find_list (char const* key) const;
   bdecode_node dict_find (char const* key) const;
   bdecode_node dict_find_dict (std::string key) const;
   bdecode_node dict_find_dict (char const* key) const;
   std::pair<std::string, bdecode_node> dict_at (int i) const;
   bdecode_node dict_find_int (char const* key) const;
   boost::int64_t int_value () const;
   int string_length () const;
   std::string string_value () const;
   char const* string_ptr () const;
   void clear ();
   void swap (bdecode_node& n);
   void reserve (int tokens);
   void switch_underlying_buffer (char const* buf);

   enum type_t
   {
      none_t,
      dict_t,
      list_t,
      string_t,
      int_t,
   };
};

bdecode_node()

bdecode_node ();

creates a default-constructed node. It will have the type none_t.

bdecode_node() operator=()

bdecode_node (bdecode_node const&);
bdecode_node& operator= (bdecode_node const&);

For owning nodes, the copy will create a copy of the tree, but the underlying buffer remains the same.

type()

type_t type () const;

the type of this node. See type_t.

bool()

operator bool () const;

returns true if type() != none_t.

non_owning()

bdecode_node non_owning () const;

returns a non-owning reference to this node. This is useful to refer to the root node without copying it in assignments.

data_section()

std::pair<char const*, int> data_section () const;

returns the buffer and length of the section in the original bencoded buffer where this node is defined. For a dictionary for instance, this starts with d and ends with e, and has all the content of the dictionary in between.
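As a rough illustration of what such a (pointer, length) pair describes, the sketch below copies a section out of a bencoded buffer. The helper section_bytes() is hypothetical and not part of libtorrent; it only assumes that the pair has the same shape as the return value of data_section().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not a libtorrent function): copies the bytes described
// by a (pointer, length) pair, such as the one data_section() returns, into
// an owning std::string. The section is binary data, so the length is used
// rather than any terminator.
inline std::string section_bytes(std::pair<char const*, int> section)
{
    return std::string(section.first, static_cast<std::size_t>(section.second));
}
```

For the buffer "d4:infod3:fooi42eee", the inner dictionary starts at byte offset 7 and is 11 bytes long, so its section is "d3:fooi42ee": it starts with d, ends with e, and contains everything in between.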

list_at() list_string_value_at() list_int_value_at() list_size()

bdecode_node list_at (int i) const;
std::string list_string_value_at (int i
      , char const* default_val = "");
boost::int64_t list_int_value_at (int i
      , boost::int64_t default_val = 0);
int list_size () const;

functions with the list_ prefix operate on lists. These functions are only valid if type() == list_t. list_at() returns the item in the list at index i. i must be less than the size of the list. list_size() returns the size of the list.

dict_size() dict_find_dict() dict_find_string() dict_find_int() dict_at() dict_find_list() dict_find_int_value() dict_find() dict_find_string_value()

std::string dict_find_string_value (char const* key
      , char const* default_value = "") const;
bdecode_node dict_find_string (char const* key) const;
int dict_size () const;
boost::int64_t dict_find_int_value (char const* key
      , boost::int64_t default_val = 0) const;
bdecode_node dict_find (std::string key) const;
bdecode_node dict_find_list (char const* key) const;
bdecode_node dict_find (char const* key) const;
bdecode_node dict_find_dict (std::string key) const;
bdecode_node dict_find_dict (char const* key) const;
std::pair<std::string, bdecode_node> dict_at (int i) const;
bdecode_node dict_find_int (char const* key) const;

Functions with the dict_ prefix operate on dictionaries. They are only valid if type() == dict_t. In case a key you're looking up contains a 0 byte, you cannot use the null-terminated string overloads, but have to use std::string instead. dict_find_list will return a valid bdecode_node if the key is found and it is a list. Otherwise it will return a default-constructed bdecode_node.

Functions with the _value suffix return the value of the node directly, rather than the node itself. In case the node is not found, or it has a different type, a default value is returned (which can be specified).
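The restriction on keys containing a 0 byte follows from how C strings work. The sketch below uses only the standard library to show the difference; the two helper functions are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Builds a three-byte key whose middle byte is zero: 'a', '\0', 'b'.
// Passing the length explicitly keeps all three bytes.
inline std::string key_with_zero_byte()
{
    char const raw[] = {'a', '\0', 'b'};
    return std::string(raw, sizeof(raw));
}

// The length a null-terminated (char const*) interface would see for the
// same bytes: everything from the first zero byte onwards is lost.
inline std::size_t length_seen_as_c_string(std::string const& key)
{
    return std::strlen(key.c_str());
}
```

Here key_with_zero_byte() has size 3, while the same bytes viewed as a C string have length 1. A key like this would therefore have to go through one of the std::string overloads declared above, such as dict_find (std::string key).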

int_value()

boost::int64_t int_value () const;

this function is only valid if type() == int_t. It returns the value of the integer.

string_ptr() string_length() string_value()

int string_length () const;
std::string string_value () const;
char const* string_ptr () const;

these functions are only valid if type() == string_t. They return the string values. Note that the buffer returned by string_ptr() is not null-terminated; always use it together with string_length(), which returns the number of bytes in the string.
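Because the pointer refers into the original bencoded buffer, whatever follows the string in that buffer comes right after its last byte. A minimal sketch of copying such a string safely, where copy_string() is a hypothetical helper and ptr and len stand in for the results of string_ptr() and string_length():

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies exactly `len` bytes starting at `ptr` into an
// owning std::string. It never looks for a terminating zero byte.
inline std::string copy_string(char const* ptr, int len)
{
    return std::string(ptr, static_cast<std::size_t>(len));
}
```

In the buffer "3:fooi42e" the string payload starts at offset 2 and is 3 bytes long. copy_string(buf + 2, 3) yields "foo", whereas treating buf + 2 as a C string would also pick up the following "i42e".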

clear()

void clear ();

resets the bdecode_node to a default constructed state. If this is an owning node, the tree is freed and all child nodes are invalidated.

swap()

void swap (bdecode_node& n);

Swap contents.

reserve()

void reserve (int tokens);

pre-allocates memory for the specified number of tokens. This is useful if you know approximately how many tokens are in the file you are about to parse. Doing so will save realloc operations while parsing. You should only call this on the root node, before passing it in to bdecode().

switch_underlying_buffer()

void switch_underlying_buffer (char const* buf);

switches the buffer this node refers to. The new buffer MUST be identical to the one originally parsed. This operation is only defined on owning root nodes, i.e. the one passed in to bdecode().
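One way to picture why the replacement buffer has to be identical is a toy model in which parsed positions are kept as offsets from a base pointer. This is an assumption made for illustration only; toy_string_ref and its members are hypothetical and are not libtorrent types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model, not libtorrent's implementation: a reference to a string inside
// a bencoded buffer, stored as an offset and length relative to a base
// pointer. Nothing is copied out of the buffer.
struct toy_string_ref
{
    char const* base; // start of the bencoded buffer
    int offset;       // where the string payload starts
    int length;       // number of bytes in the payload

    std::string value() const
    {
        return std::string(base + offset, static_cast<std::size_t>(length));
    }

    // Re-point the reference at another copy of the same bytes. The offsets
    // only stay meaningful if the new buffer has identical content.
    void switch_underlying_buffer(char const* buf) { base = buf; }
};
```

In this model, after the original bytes have been copied elsewhere, switching the base pointer to the copy keeps the reference usable even once the first buffer is released. A buffer with different content would make the stored offset and length point at the wrong bytes.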

enum type_t

Declared in "libtorrent/bdecode.hpp"

name      value  description
none_t    0      uninitialized or default constructed. This is also used to indicate that a node was not found in some cases.
dict_t    1      a dictionary node. The dict_ functions are valid.
list_t    2      a list node. The list_ functions are valid.
string_t  3      a string node. The string_ functions are valid.
int_t     4      an integer node. int_value() is valid.

print_entry()

Declared in "libtorrent/bdecode.hpp"

std::string print_entry (bdecode_node const& e
   , bool single_line = false, int indent = 0);

print the bencoded structure in a human-readable format to a string that's returned.

bdecode()

Declared in "libtorrent/bdecode.hpp"

int bdecode (char const* start, char const* end, bdecode_node& ret
   , error_code& ec, int* error_pos = 0, int depth_limit = 100
   , int token_limit = 1000000);

This function decodes/parses bencoded data (for example a .torrent file). The data structure is returned in the ret argument. The buffer to parse is specified by start, a pointer to its first byte, and end, a pointer one byte past its last byte. If the buffer fails to parse, the function returns a non-zero value and fills in ec with the error code. The optional argument error_pos, if non-null, will be set to the byte offset into the buffer where the parse failure occurred.

depth_limit specifies the maximum number of nested lists or dictionaries allowed in the data structure. (This affects the stack usage of the function, so be careful not to set it too high.)

token_limit is the max number of tokens allowed to be parsed from the buffer. This is simply a sanity check to not have unbounded memory usage.
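To make the two limits concrete, here is a small stand-alone checker that walks a bencoded buffer and enforces a nesting limit and an element limit. It is a sketch, not libtorrent's parser: limit_checker and limit_result are hypothetical names, it counts one token per element (which may differ from how bdecode() counts tokens internally), and it does not verify that dictionary keys are strings.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

enum class limit_result { ok, malformed, depth_exceeded, tokens_exceeded };

// Hypothetical checker illustrating depth_limit and token_limit.
struct limit_checker
{
    int depth_limit;
    int token_limit;
    int tokens; // elements seen so far

    limit_checker(int d, int t) : depth_limit(d), token_limit(t), tokens(0) {}

    static bool is_digit(char c)
    { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

    // Consumes one element starting at p and advances p past it.
    limit_result check(char const*& p, char const* end, int depth = 0)
    {
        if (p == end) return limit_result::malformed;
        if (++tokens > token_limit) return limit_result::tokens_exceeded;

        if (*p == 'l' || *p == 'd') {
            // Entering a container increases the nesting level by one.
            if (depth + 1 > depth_limit) return limit_result::depth_exceeded;
            ++p;
            while (p != end && *p != 'e') {
                limit_result r = check(p, end, depth + 1);
                if (r != limit_result::ok) return r;
            }
            if (p == end) return limit_result::malformed;
            ++p; // skip the closing 'e'
            return limit_result::ok;
        }
        if (*p == 'i') {
            // integer: i<digits>e with an optional leading '-'
            ++p;
            if (p != end && *p == '-') ++p;
            char const* digits = p;
            while (p != end && is_digit(*p)) ++p;
            if (p == digits || p == end || *p != 'e')
                return limit_result::malformed;
            ++p;
            return limit_result::ok;
        }
        if (is_digit(*p)) {
            // string: <length>:<bytes>
            std::ptrdiff_t len = 0;
            while (p != end && is_digit(*p)) {
                len = len * 10 + (*p - '0');
                // A length larger than what is left can never be valid;
                // this also keeps len from overflowing.
                if (len > end - p) return limit_result::malformed;
                ++p;
            }
            if (p == end || *p != ':') return limit_result::malformed;
            ++p;
            if (end - p < len) return limit_result::malformed;
            p += len;
            return limit_result::ok;
        }
        return limit_result::malformed;
    }
};
```

For the buffer "lli1eee" (a list containing a list containing one integer), this checker sees two levels of nesting and three elements. It therefore accepts the buffer with the default-sized limits, reports depth_exceeded with a depth limit of 1, and reports tokens_exceeded with a token limit of 2.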

The resulting bdecode_node is an owning node. That means it will be holding the whole parsed tree. When iterating lists and dictionaries, those bdecode_node objects will simply have references to the root or owning bdecode_node. If the root node is destructed, all other nodes that refer to anything in that tree become invalid.

However, the underlying buffer passed in to this function (start, end) must also remain valid while the bdecoded tree is used. The parsed tree produced by this function does not copy any data out of the buffer, but simply produces references back into it.