Group PJ_SCAN

Text scanning utility.

This module describes fast text scanning functions.

Typedefs

typedef void (*pj_syn_err_func_ptr)(struct pj_scanner *scanner)

The type of the callback function that the scanner calls when it encounters a syntax error.

Parameters

scanner – The scanner instance that calls the callback.

Enums

enum [anonymous]

Flags for scanner.

Values:

enumerator PJ_SCAN_AUTOSKIP_WS

This flag specifies that the scanner should automatically skip whitespace.

enumerator PJ_SCAN_AUTOSKIP_WS_HEADER

This flag specifies that the scanner should automatically skip SIP header continuations. This flag implies PJ_SCAN_AUTOSKIP_WS.

enumerator PJ_SCAN_AUTOSKIP_NEWLINE

Auto-skip new lines.

Functions

void pj_cis_buf_init(pj_cis_buf_t *cs_buf)

Initialize scanner input specification buffer.

Parameters

cs_buf – The scanner character specification.

pj_status_t pj_cis_init(pj_cis_buf_t *cs_buf, pj_cis_t *cis)

Create a new input specification.

Parameters
  • cs_buf – Specification buffer.

  • cis – Character input specification to be initialized.

Returns

PJ_SUCCESS if new specification has been successfully created, or PJ_ETOOMANY if there are already too many specifications in the buffer.

pj_status_t pj_cis_dup(pj_cis_t *new_cis, pj_cis_t *existing)

Create a new input specification based on an existing specification.

Parameters
  • new_cis – The new specification to be initialized.

  • existing – The existing specification, from which the input bitmask will be copied to the new specification.

Returns

PJ_SUCCESS if new specification has been successfully created, or PJ_ETOOMANY if there are already too many specifications in the buffer.

void pj_cis_add_range(pj_cis_t *cis, int cstart, int cend)

Add the characters in the specified range ‘[cstart, cend)’ to the specification (the last character itself (‘cend’) is not added).

Parameters
  • cis – The scanner character specification.

  • cstart – The first character in the range.

  • cend – The next character after the last character in the range.
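The half-open range convention is easy to get wrong by one. The following is a minimal, self-contained sketch of the idea, not the library's implementation: `char_set`, `char_set_clear`, `char_set_add_range` and `char_set_has` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a character specification:
 * one membership flag per possible byte value. */
typedef struct char_set {
    unsigned char member[256];
} char_set;

static void char_set_clear(char_set *cs)
{
    memset(cs->member, 0, sizeof(cs->member));
}

/* Add every character c with cstart <= c < cend.
 * The character cend itself is NOT added. */
static void char_set_add_range(char_set *cs, int cstart, int cend)
{
    int c;
    assert(cstart >= 0 && cend <= 256 && cstart <= cend);
    for (c = cstart; c < cend; ++c)
        cs->member[c] = 1;
}

static int char_set_has(const char_set *cs, unsigned char c)
{
    return cs->member[c];
}
```

Under this convention, covering all lowercase letters requires an end value of `'z' + 1`, for example `char_set_add_range(&cs, 'a', 'z' + 1)`.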

void pj_cis_add_alpha(pj_cis_t *cis)

Add alphabetic characters to the specification.

Parameters

cis – The scanner character specification.

void pj_cis_add_num(pj_cis_t *cis)

Add numeric characters to the specification.

Parameters

cis – The scanner character specification.

void pj_cis_add_str(pj_cis_t *cis, const char *str)

Add the characters in the string to the specification.

Parameters
  • cis – The scanner character specification.

  • str – The string.

void pj_cis_add_cis(pj_cis_t *cis, const pj_cis_t *rhs)

Add specification from another specification.

Parameters
  • cis – The specification to be modified.

  • rhs – The specification to be copied.

void pj_cis_del_range(pj_cis_t *cis, int cstart, int cend)

Delete characters in the specified range from the specification.

Parameters
  • cis – The scanner character specification.

  • cstart – The first character in the range.

  • cend – The next character after the last character in the range.

void pj_cis_del_str(pj_cis_t *cis, const char *str)

Delete characters in the specified string from the specification.

Parameters
  • cis – The scanner character specification.

  • str – The string.

void pj_cis_invert(pj_cis_t *cis)

Invert specification.

Parameters

cis – The scanner character specification.

int pj_cis_match(const pj_cis_t *cis, pj_uint8_t c)

Check whether the specified character belongs to the specification.

Parameters
  • cis – The scanner character specification.

  • c – The character to check for matching.

Returns

Non-zero if match (not necessarily one).
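A return value that is non-zero but not necessarily one is typical of a bitmask test. The sketch below shows one plausible layout in which several specifications share a single table. That layout is an assumption made for illustration, and `spec_table`, `spec`, `spec_add` and `spec_match` are hypothetical names, not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: every byte value owns one 32-bit word, and the
 * specification numbered `id` uses bit `id` of each word. At most 32
 * specifications therefore fit into one table. */
typedef struct spec_table {
    uint32_t bits[256];
} spec_table;

typedef struct spec {
    spec_table *table;
    unsigned    id;     /* bit position, 0..31 */
} spec;

static void spec_add(spec *s, unsigned char c)
{
    assert(s->id < 32);
    s->table->bits[c] |= (uint32_t)1 << s->id;
}

/* Returns the raw masked word: zero when c is not in the set,
 * and some non-zero value (here 1 << id) when it is. */
static int spec_match(const spec *s, unsigned char c)
{
    return (int)(s->table->bits[c] & ((uint32_t)1 << s->id));
}
```

With a result like this, callers should test the return value for truth, as in `if (pj_cis_match(cis, c))`, rather than compare it against 1.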

void pj_scan_init(pj_scanner *scanner, char *bufstart, int buflen, unsigned options, pj_syn_err_func_ptr callback)

Initialize the scanner. Note that the input string buffer must be at least buflen+1 bytes long, because the scanner will NUL-terminate the string during initialization.

Parameters
  • scanner – The scanner to be initialized.

  • bufstart – The input buffer to scan. Note that bufstart[buflen] will hold a NUL character until the scanner is destroyed, so the actual buffer length must be at least buflen+1.

  • buflen – The length of the input buffer, which normally is strlen(bufstart).

  • options – Zero, or a combination of PJ_SCAN_AUTOSKIP_WS and PJ_SCAN_AUTOSKIP_WS_HEADER.

  • callback – Callback to be called when the scanner encounters a syntax error.
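The buflen+1 requirement matters most when the input is not already NUL-terminated, such as a packet received from the network. The following sketch shows one way to prepare such a buffer. `make_scan_buffer` is a hypothetical helper invented for this example, and it uses plain `malloc` rather than any library allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `len` bytes of input into a fresh buffer that
 * has one extra byte, so that a terminator written at index `len` stays
 * inside the allocation. Returns NULL if the allocation fails. */
static char *make_scan_buffer(const char *src, size_t len)
{
    char *buf;

    assert(src != NULL);
    buf = (char *)malloc(len + 1);
    if (buf == NULL)
        return NULL;
    memcpy(buf, src, len);
    buf[len] = '\0';    /* the slot a scanner would overwrite */
    return buf;
}
```

The returned pointer could then serve as bufstart, with buflen set to `len` (not `len + 1`).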

void pj_scan_fini(pj_scanner *scanner)

Call this function when the application has finished using the scanner.

Parameters

scanner – The scanner.

int pj_scan_is_eof(const pj_scanner *scanner)

Determine whether the EOF condition for the scanner has been met.

Parameters

scanner – The scanner.

Returns

Non-zero if the scanner is at EOF.

int pj_scan_peek(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Peek at the string at the current position that matches parameter spec, and return it in parameter out. The current scanner position will not be moved. If the scanner is already in EOF state, the syntax error callback will be called.

Parameters
  • scanner – The scanner.

  • spec – The spec to match input string.

  • out – String to store the result.

Returns

The character right after the peeked position, or zero if there are no more characters.

int pj_scan_peek_n(pj_scanner *scanner, pj_size_t len, pj_str_t *out)

Peek at len characters at the current position, and return them in the out parameter. Note that whitespace and newlines will be returned as is, regardless of the PJ_SCAN_AUTOSKIP_WS setting. If fewer than len characters remain, the syntax error callback will be called.

Parameters
  • scanner – The scanner.

  • len – Length to peek.

  • out – String to store the result.

Returns

The character right after the peeked position, or zero if there are no more characters.

int pj_scan_peek_until(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Peek at the string from the current position up to the first character that matches spec, and return it in parameter out. The current scanner position will not be moved. If the scanner is already in EOF state, the syntax error callback will be called.

Parameters
  • scanner – The scanner.

  • spec – The peeking will stop when the input matches this spec.

  • out – String to store the result.

Returns

The character right after the peeked position.

void pj_scan_get(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Get characters from the buffer according to the spec, and return them in the out parameter. The scanner will get as many characters as possible for as long as the spec matches. If the first character doesn’t match the spec, or the scanner is already at EOF when this function is called, an exception will be thrown.

Parameters
  • scanner – The scanner.

  • spec – The spec to match input string.

  • out – String to store the result.
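The greedy behaviour described above can be sketched without the library. In the following illustration, `char_pred`, `is_digit` and `greedy_get` are hypothetical names, a plain predicate function stands in for the character specification, and a return code of -1 stands in for the syntax error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate standing in for a character specification. */
typedef int (*char_pred)(int c);

static int is_digit(int c)
{
    return c >= '0' && c <= '9';
}

/* Consume the longest run of matching characters starting at *pos.
 * On success, store the run in (*out, *out_len), advance *pos past it and
 * return 0. Return -1, leaving *pos untouched, when the input is exhausted
 * or the first character does not match. */
static int greedy_get(const char *buf, size_t len, size_t *pos,
                      char_pred match, const char **out, size_t *out_len)
{
    size_t start = *pos;
    size_t p = start;

    assert(start <= len);
    while (p < len && match((unsigned char)buf[p]))
        ++p;
    if (p == start)
        return -1;
    *out = buf + start;
    *out_len = p - start;
    *pos = p;
    return 0;
}
```

Scanning `"123abc"` with `is_digit` yields the three characters `123` and leaves the position on `a`. A second call at that position fails because `a` is not a digit.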

void pj_scan_get_unescape(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Just like pj_scan_get(), but additionally performs unescaping when an escape (‘%’) character is found. The input spec MUST NOT contain the ‘%’ character.

Parameters
  • scanner – The scanner.

  • spec – The spec to match input string.

  • out – String to store the result.

void pj_scan_get_quote(pj_scanner *scanner, int begin_quote, int end_quote, pj_str_t *out)

Get characters between quotes. If the current input doesn’t match begin_quote, a syntax error will be thrown. Note that the resulting string will contain the enclosing quotes.

Parameters
  • scanner – The scanner.

  • begin_quote – The character to begin the quote.

  • end_quote – The character to end the quote.

  • out – String to store the result.
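Because the result includes the enclosing quotes, callers usually need to strip one character from each end. The sketch below illustrates which span is returned. `quoted_span` is a hypothetical function, and it ignores details such as escaped quote characters inside the string.

```c
#include <assert.h>
#include <stddef.h>

/* Return the length of the quoted token at the start of s, counting both
 * the opening and the closing quote. Return 0 when s does not start with
 * begin_quote or when no end_quote follows (the analogue of a syntax
 * error in this sketch). */
static size_t quoted_span(const char *s, size_t len,
                          int begin_quote, int end_quote)
{
    size_t p;

    assert(s != NULL);
    if (len == 0 || s[0] != begin_quote)
        return 0;
    for (p = 1; p < len; ++p) {
        if (s[p] == end_quote)
            return p + 1;   /* +1 keeps the closing quote in the span */
    }
    return 0;
}
```

For the input `<sip:a@b> x` with `<` and `>` as quotes, the span is the nine characters `<sip:a@b>`. The content without the quotes starts at offset 1 and is two characters shorter.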

void pj_scan_get_quotes(pj_scanner *scanner, const char *begin_quotes, const char *end_quotes, int qsize, pj_str_t *out)

Get characters between quotes. If the current input doesn’t match any character in begin_quotes, a syntax error will be thrown. Note that the resulting string will contain the enclosing quotes.

Parameters
  • scanner – The scanner.

  • begin_quotes – The character array to begin the quotes. For example, the two characters " and '.

  • end_quotes – The character array to end the quotes. The position found in the begin_quotes array is used to select the matching end quote. So if begin_quotes is the array "'< then end_quotes should be "'>. If the input matched the ' in begin_quotes, the scanner will look for ' as the end quote.

  • qsize – The size of the begin_quotes and end_quotes arrays.

  • out – String to store the result.
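The index pairing between the two arrays can be sketched as follows. `quoted_span_multi` is a hypothetical function written for this illustration. It returns a length instead of filling a string, and it returns 0 where the scanner would report a syntax error.

```c
#include <assert.h>
#include <stddef.h>

/* Find which begin quote opens the input, then search only for the end
 * quote stored at the same index. Return the length of the token including
 * both quotes, or 0 if no begin quote matches or no end quote follows. */
static size_t quoted_span_multi(const char *s, size_t len,
                                const char *begin_quotes,
                                const char *end_quotes, int qsize)
{
    int i;
    size_t p;

    assert(qsize >= 0);
    if (len == 0)
        return 0;
    for (i = 0; i < qsize; ++i) {
        if (s[0] == begin_quotes[i])
            break;
    }
    if (i == qsize)
        return 0;               /* input does not start with any quote */
    for (p = 1; p < len; ++p) {
        if (s[p] == end_quotes[i])
            return p + 1;       /* other end quotes are ignored */
    }
    return 0;
}
```

With begin quotes `"'<` and end quotes `"'>`, the input `<a'b>` is one token of five characters, because only `>` can close a token opened by `<`.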

void pj_scan_get_n(pj_scanner *scanner, unsigned N, pj_str_t *out)

Get N characters from the scanner.

Parameters
  • scanner – The scanner.

  • N – Number of characters to get.

  • out – String to store the result.

int pj_scan_get_char(pj_scanner *scanner)

Get one character from the scanner.

Parameters

scanner – The scanner.

Returns

The character.

void pj_scan_get_until(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Get characters from the scanner and move the scanner position until the current character matches the spec.

Parameters
  • scanner – The scanner.

  • spec – Get until the input matches this spec.

  • out – String to store the result.

void pj_scan_get_until_ch(pj_scanner *scanner, int until_char, pj_str_t *out)

Get characters from the scanner and move the scanner position until the current character matches until_char.

Parameters
  • scanner – The scanner.

  • until_char – Get until the input matches this character.

  • out – String to store the result.

void pj_scan_get_until_chr(pj_scanner *scanner, const char *until_spec, pj_str_t *out)

Get characters from the scanner and move the scanner position until the current character matches any of the characters in until_spec.

Parameters
  • scanner – The scanner.

  • until_spec – Get until the input matches any of these characters.

  • out – String to store the result.
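The "get until" family stops with the current character on the delimiter, so the delimiter itself is not part of the result. A minimal sketch of that behaviour follows. `span_until` is a hypothetical function, and returning the full length when no delimiter is found is an assumption of this sketch rather than documented library behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters at the start of s that precede the first character
 * found in the NUL-terminated list stop_chars. The stop character is not
 * counted, so s[result] is the delimiter when one was found. Returns len
 * when no stop character occurs (an assumption of this sketch). */
static size_t span_until(const char *s, size_t len, const char *stop_chars)
{
    size_t p;
    const char *q;

    assert(s != NULL && stop_chars != NULL);
    for (p = 0; p < len; ++p) {
        for (q = stop_chars; *q != '\0'; ++q) {
            if (s[p] == *q)
                return p;
        }
    }
    return len;
}
```

For the input `name=value;x` and the stop list `=;`, the result is 4, which selects `name` and leaves `=` as the next character to read.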

void pj_scan_advance_n(pj_scanner *scanner, unsigned N, pj_bool_t skip)

Advance the scanner N characters, and skip whitespace if necessary.

Parameters
  • scanner – The scanner.

  • N – Number of characters to skip.

  • skip – Flag to specify whether whitespace should be skipped after skipping the characters.

int pj_scan_strcmp(pj_scanner *scanner, const char *s, int len)

Compare string in current position with the specified string.

Parameters
  • scanner – The scanner.

  • s – The string to compare with.

  • len – Length of the string to compare.

Returns

Zero, <0, or >0 (just like strcmp()).

int pj_scan_stricmp(pj_scanner *scanner, const char *s, int len)

Case-insensitive comparison of the string at the current position with the specified string.

Parameters
  • scanner – The scanner.

  • s – The string to compare with.

  • len – Length of the string to compare with.

Returns

Zero, <0, or >0 (just like strcmp()).

int pj_scan_stricmp_alnum(pj_scanner *scanner, const char *s, int len)

Perform a case-insensitive comparison of the string at the current position, knowing that the string to compare consists only of alphanumeric characters.

Note that unlike pj_scan_stricmp, this function can only return zero or -1.

See

strnicmp_alnum, pj_stricmp_alnum

Parameters
  • scanner – The scanner.

  • s – The string to compare with.

  • len – Length of the string to compare with.

Returns

Zero if equal, or -1 otherwise.
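Restricting the input to alphanumeric characters permits a cheaper comparison than general case folding. The sketch below shows one common trick, clearing the ASCII case bit. That this resembles what the library does is an assumption, and `cmp_alnum_nocase` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Compare len bytes, ignoring ASCII case. Clearing bit 0x20 maps 'a'..'z'
 * onto 'A'..'Z' and maps the digits onto distinct codes that collide with
 * no letter, so the test is exact for alphanumeric input only.
 * Returns 0 if equal and -1 otherwise; no ordering is reported. */
static int cmp_alnum_nocase(const char *a, const char *b, size_t len)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < len; ++i) {
        unsigned ca = (unsigned char)a[i] & ~0x20u;
        unsigned cb = (unsigned char)b[i] & ~0x20u;
        if (ca != cb)
            return -1;
    }
    return 0;
}
```

The restriction is essential: under this scheme the non-alphanumeric characters `@` (0x40) and the backquote (0x60) would wrongly compare as equal.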

void pj_scan_get_newline(pj_scanner *scanner)

Get a newline from the scanner. A newline is defined as ‘\n’, or ‘\r’, or “\r\n”. If the current input is not a newline, a syntax error will be thrown.

Parameters

scanner – The scanner.
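The three newline forms listed above can be recognized with a short check that treats “\r\n” as a single newline. `newline_len` is a hypothetical helper written for this illustration. It returns 0 where the scanner would report a syntax error.

```c
#include <assert.h>
#include <stddef.h>

/* Return the number of characters occupied by the newline at the start of
 * s: 2 for "\r\n", 1 for a lone '\r' or '\n', 0 if s does not start with
 * a newline. */
static size_t newline_len(const char *s, size_t len)
{
    assert(s != NULL);
    if (len == 0)
        return 0;
    if (s[0] == '\r')
        return (len > 1 && s[1] == '\n') ? 2 : 1;
    if (s[0] == '\n')
        return 1;
    return 0;
}
```

Checking for ‘\r’ first is what keeps “\r\n” from being counted as two separate newlines.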

void pj_scan_skip_whitespace(pj_scanner *scanner)

Manually skip whitespace according to the flags that were specified when the scanner was initialized.

Parameters

scanner – The scanner.

void pj_scan_skip_line(pj_scanner *scanner)

Skip current line.

Parameters

scanner – The scanner.

void pj_scan_save_state(const pj_scanner *scanner, pj_scan_state *state)

Save the full scanner state.

Parameters
  • scanner – The scanner.

  • state – Variable to store scanner’s state.

void pj_scan_restore_state(pj_scanner *scanner, pj_scan_state *state)

Restore the full scanner state. Note that this does not restore the string if the application has modified it. It only restores the scanning position.

Parameters
  • scanner – The scanner.

  • state – State of the scanner.
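Saving and restoring the state is the usual way to attempt a parse and back out if it fails. The following self-contained sketch shows the pattern. `mini_scanner`, `mini_state`, `mini_save`, `mini_restore` and `try_keyword` are hypothetical stand-ins, and their fields are assumptions that do not reflect the layout of the real structures.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature scanner: a cursor, an end pointer, a line count. */
typedef struct mini_scanner {
    const char *cur;
    const char *end;
    int         line;
} mini_scanner;

/* Only positional information is saved; the buffer contents are not. */
typedef struct mini_state {
    const char *cur;
    int         line;
} mini_state;

static void mini_save(const mini_scanner *sc, mini_state *st)
{
    st->cur = sc->cur;
    st->line = sc->line;
}

static void mini_restore(mini_scanner *sc, const mini_state *st)
{
    sc->cur = st->cur;
    sc->line = st->line;
}

/* Try to consume the keyword kw. On a mismatch, roll the scanner back to
 * where it started and return 0; on success return 1. */
static int try_keyword(mini_scanner *sc, const char *kw)
{
    mini_state saved;
    size_t n = strlen(kw);

    mini_save(sc, &saved);
    while (n > 0 && sc->cur < sc->end && *sc->cur == *kw) {
        ++sc->cur;
        ++kw;
        --n;
    }
    if (n != 0) {
        mini_restore(sc, &saved);
        return 0;
    }
    return 1;
}
```

As with the real functions, rolling back only resets the position. Any bytes the caller has overwritten in the buffer in the meantime stay overwritten.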

int pj_scan_get_col(const pj_scanner *scanner)

Get current column position.

Parameters

scanner – The scanner.

Returns

The column position.

struct pj_scanner
#include <scanner.h>

The text scanner structure.

struct pj_scan_state
#include <scanner.h>

This structure can be used by the application to store the state of the scanner, so that the scanner can be rolled back to this state when necessary.