Group PJ_SCAN

Text scanning utility.

This module describes fast text scanning functions.

Typedefs

typedef void (*pj_syn_err_func_ptr)(struct pj_scanner *scanner)

The callback function type to be called by the scanner when it encounters a syntax error.

Param scanner:

The scanner instance that calls the callback.

Enums

enum [anonymous]

Flags for scanner.

Values:

enumerator PJ_SCAN_AUTOSKIP_WS

This flag specifies that the scanner should automatically skip whitespace.

enumerator PJ_SCAN_AUTOSKIP_WS_HEADER

This flag specifies that the scanner should automatically skip SIP header continuations. This flag implies PJ_SCAN_AUTOSKIP_WS.

enumerator PJ_SCAN_AUTOSKIP_NEWLINE

Auto-skip new lines.

Functions

void pj_cis_buf_init(pj_cis_buf_t *cs_buf)

Initialize scanner input specification buffer.

Parameters:

cs_buf – The scanner character specification.

pj_status_t pj_cis_init(pj_cis_buf_t *cs_buf, pj_cis_t *cis)

Create a new input specification.

Parameters:
  • cs_buf – Specification buffer.

  • cis – Character input specification to be initialized.

Returns:

PJ_SUCCESS if new specification has been successfully created, or PJ_ETOOMANY if there are already too many specifications in the buffer.
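
The PJ_ETOOMANY result suggests that specifications share a fixed-capacity buffer. The following is a minimal sketch of one way such a buffer could work, assuming one bit per specification in a 256-entry table. All `demo_` names and the limit of 32 are hypothetical and are not taken from PJLIB.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; this is an illustration, not PJLIB's implementation. */
#define DEMO_MAX_SPECS 32           /* one bit per spec in a 32-bit cell */

typedef struct demo_cis_buf {
    uint32_t table[256];            /* one cell per possible byte value */
    uint32_t used_mask;             /* bit i set => spec slot i is taken */
} demo_cis_buf;

typedef struct demo_cis {
    uint32_t *table;                /* points into the shared buffer */
    int       bit;                  /* which bit belongs to this spec */
} demo_cis;

static void demo_cis_buf_init(demo_cis_buf *buf)
{
    assert(buf != NULL);
    memset(buf, 0, sizeof(*buf));
}

/* Returns 0 on success, or -1 when every slot is taken
 * (the analogue of PJ_ETOOMANY in this sketch). */
static int demo_cis_init(demo_cis_buf *buf, demo_cis *cis)
{
    int i;
    assert(buf != NULL && cis != NULL);
    for (i = 0; i < DEMO_MAX_SPECS; ++i) {
        if ((buf->used_mask & (UINT32_C(1) << i)) == 0) {
            buf->used_mask |= UINT32_C(1) << i;
            cis->table = buf->table;
            cis->bit = i;
            return 0;
        }
    }
    return -1;
}
```

In this sketch the buffer must be initialized once before the first specification is created, which mirrors the order pj_cis_buf_init() then pj_cis_init() described above.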

pj_status_t pj_cis_dup(pj_cis_t *new_cis, pj_cis_t *existing)

Create a new input specification based on an existing specification.

Parameters:
  • new_cis – The new specification to be initialized.

  • existing – The existing specification, from which the input bitmask will be copied to the new specification.

Returns:

PJ_SUCCESS if new specification has been successfully created, or PJ_ETOOMANY if there are already too many specifications in the buffer.

void pj_cis_add_range(pj_cis_t *cis, int cstart, int cend)

Add the characters in the specified range ‘[cstart, cend)’ to the specification (the last character itself (‘cend’) is not added).

Parameters:
  • cis – The scanner character specification.

  • cstart – The first character in the range.

  • cend – The next character after the last character in the range.
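
Because the range is half-open, the end argument must be one past the last character wanted. The sketch below illustrates this with a hypothetical stand-in type (`demo_charset`) rather than pj_cis_t, so it can be compiled on its own.

```c
#include <assert.h>

/* Hypothetical stand-in for a character specification:
 * one membership flag per byte value. */
typedef struct demo_charset {
    unsigned char member[256];
} demo_charset;

/* Add every character c with cstart <= c < cend; cend itself is left out. */
static void demo_add_range(demo_charset *cs, int cstart, int cend)
{
    int c;
    assert(cstart >= 0 && cend <= 256 && cstart <= cend);
    for (c = cstart; c < cend; ++c)
        cs->member[c] = 1;
}

/* Lowercase letters: pass 'z' + 1 as the end so that 'z' is included. */
static void demo_add_lower(demo_charset *cs)
{
    demo_add_range(cs, 'a', 'z' + 1);
}
```

Passing `'9'` as the end of a digit range would leave `'9'` out; the end should be `'9' + 1`.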

void pj_cis_add_alpha(pj_cis_t *cis)

Add alphabetic characters to the specification.

Parameters:

cis – The scanner character specification.

void pj_cis_add_num(pj_cis_t *cis)

Add numeric characters to the specification.

Parameters:

cis – The scanner character specification.

void pj_cis_add_str(pj_cis_t *cis, const char *str)

Add the characters in the string to the specification.

Parameters:
  • cis – The scanner character specification.

  • str – The string.

void pj_cis_add_cis(pj_cis_t *cis, const pj_cis_t *rhs)

Add specification from another specification.

Parameters:
  • cis – The specification to be set.

  • rhs – The specification to be copied.

void pj_cis_del_range(pj_cis_t *cis, int cstart, int cend)

Delete characters in the specified range from the specification.

Parameters:
  • cis – The scanner character specification.

  • cstart – The first character in the range.

  • cend – The next character after the last character in the range.

void pj_cis_del_str(pj_cis_t *cis, const char *str)

Delete characters in the specified string from the specification.

Parameters:
  • cis – The scanner character specification.

  • str – The string.

void pj_cis_invert(pj_cis_t *cis)

Invert specification.

Parameters:

cis – The scanner character specification.

int pj_cis_match(const pj_cis_t *cis, pj_uint8_t c)

Check whether the specified character belongs to the specification.

Parameters:
  • cis – The scanner character specification.

  • c – The character to check for matching.

Returns:

Non-zero if match (not necessarily one).
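
A result that is "not necessarily one" is what a bitmask lookup would produce. The sketch below assumes such a lookup (hypothetical `demo_` functions, not PJLIB's code) and shows how to normalise the result when a strict 0 or 1 is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bitmask lookup: each spec owns one bit in every table cell.
 * bit must be in the range 0..30 in this sketch. */
static int demo_match(const uint32_t table[256], int bit, uint8_t c)
{
    assert(bit >= 0 && bit < 31);
    /* The result is the raw masked bit, e.g. 4 for bit 2, not 1. */
    return (int)(table[c] & (UINT32_C(1) << bit));
}

/* Normalise to exactly 0 or 1 when a strict boolean is needed. */
static int demo_match_bool(const uint32_t table[256], int bit, uint8_t c)
{
    return demo_match(table, bit, c) != 0;
}
```

Callers should therefore test the result with `if (result)` or `!= 0` rather than `== 1`.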

void pj_scan_init(pj_scanner *scanner, char *bufstart, pj_size_t buflen, unsigned options, pj_syn_err_func_ptr callback)

Initialize the scanner. Note that the input string buffer MUST be NULL terminated and have length at least buflen+1 (buflen MUST NOT include the NULL terminator).

Parameters:
  • scanner – The scanner to be initialized.

  • bufstart – The input buffer to scan, which must be NULL terminated.

  • buflen – The length of the input buffer, which normally is strlen(bufstart), hence not counting the NULL terminator.

  • options – Zero, or a combination of PJ_SCAN_AUTOSKIP_WS or PJ_SCAN_AUTOSKIP_WS_HEADER.

  • callback – Callback to be called when the scanner encounters a syntax error condition.
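
The buffer contract (buflen characters followed by a terminator at index buflen) can be checked before the buffer is handed to the scanner. The helper below is a hypothetical sketch for that check; `demo_buffer_ok` and its `capacity` parameter are not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: checks the buffer contract described above.
 * capacity is the number of bytes actually allocated for buf.
 * Returns 1 if the contract holds, 0 otherwise. */
static int demo_buffer_ok(const char *buf, size_t capacity, size_t buflen)
{
    assert(buf != NULL);
    /* Need room for buflen characters plus the terminator,
     * and the terminator must sit exactly at buf[buflen]. */
    return capacity >= buflen + 1 && buf[buflen] == '\0';
}
```

For a buffer declared as `char msg[] = "OPTIONS";`, calling the helper with `sizeof(msg)` and `strlen(msg)` satisfies the contract, while passing a shorter length that lands in the middle of the text does not.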

void pj_scan_fini(pj_scanner *scanner)

Call this function when application has finished using the scanner.

Parameters:

scanner – The scanner.

int pj_scan_is_eof(const pj_scanner *scanner)

Determine whether the EOF condition for the scanner has been met.

Parameters:

scanner – The scanner.

Returns:

Non-zero if scanner is EOF.

int pj_scan_peek(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Peek strings in current position according to parameter spec, and return the strings in parameter out. The current scanner position will not be moved. If the scanner is already in EOF state, the syntax error callback will be called.

Parameters:
  • scanner – The scanner.

  • spec – The spec to match input string.

  • out – String to store the result.

Returns:

the character right after the peek-ed position or zero if there’s no more characters.
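
The peek behaviour (report the matching run, leave the position alone, return the following character) can be sketched with a tiny stand-alone scanner. The types and the digit-only matcher below are hypothetical simplifications; the sketch also omits the syntax error callback that the real function invokes at EOF.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_scanner {
    const char *curptr;   /* current position */
    const char *end;      /* one past the last input character */
} demo_scanner;

typedef struct demo_str {
    const char *ptr;      /* start of the peeked run (not NUL-terminated) */
    size_t      len;      /* length of the peeked run */
} demo_str;

/* Peek the run of decimal digits at the current position without moving it.
 * Returns the character right after the run, or 0 if the input ends there. */
static int demo_peek_digits(const demo_scanner *sc, demo_str *out)
{
    const char *p = sc->curptr;
    assert(sc->curptr <= sc->end);
    while (p < sc->end && isdigit((unsigned char)*p))
        ++p;
    out->ptr = sc->curptr;
    out->len = (size_t)(p - sc->curptr);
    return p < sc->end ? (unsigned char)*p : 0;
}
```

The returned character lets a caller decide how to continue (for example, branching on a separator) before committing to consume anything.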

int pj_scan_peek_n(pj_scanner *scanner, pj_size_t len, pj_str_t *out)

Peek len characters in current position, and return them in out parameter. Note that whitespaces or newlines will be returned as is, regardless of PJ_SCAN_AUTOSKIP_WS settings. If fewer than len characters are left, the syntax error callback will be called.

Parameters:
  • scanner – The scanner.

  • len – Length to peek.

  • out – String to store the result.

Returns:

the character right after the peek-ed position or zero if there’s no more characters.

int pj_scan_peek_until(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Peek strings in current position until spec is matched, and return the strings in parameter out. The current scanner position will not be moved. If the scanner is already in EOF state, syntax error callback will be called.

Parameters:
  • scanner – The scanner.

  • spec – The peeking will stop when the input matches this spec.

  • out – String to store the result.

Returns:

the character right after the peek-ed position.

void pj_scan_get(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Get characters from the buffer according to the spec, and return them in out parameter. The scanner will attempt to get as many characters as possible as long as the spec matches. If the first character doesn’t match the spec, or scanner is already in EOF when this function is called, an exception will be thrown.

Parameters:
  • scanner – The scanner.

  • spec – The spec to match input string.

  • out – String to store the result.

void pj_scan_get_unescape(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Just like pj_scan_get(), but additionally performs unescaping when an escape (‘%’) character is found. The input spec MUST NOT contain the specification for the ‘%’ character.

Parameters:
  • scanner – The scanner.

  • spec – The spec to match input string.

  • out – String to store the result.
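
The unescaping step can be sketched on its own. The example below assumes the escape character is ‘%’ followed by two hexadecimal digits, as in URI percent-encoding; that assumption, and the `demo_` names, are not confirmed by the text above, and the sketch does not reproduce how the scanner combines unescaping with spec matching.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller guarantees c is a hex digit. */
static int demo_hex_val(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Copy src[0..len) to dst, replacing each %XX with the byte it encodes.
 * A '%' that is not followed by two hex digits is copied unchanged.
 * dst must have room for len bytes; returns the number of bytes written. */
static size_t demo_unescape(char *dst, const char *src, size_t len)
{
    size_t i = 0, n = 0;
    assert(dst != NULL && src != NULL);
    while (i < len) {
        if (src[i] == '%' && i + 2 < len &&
            isxdigit((unsigned char)src[i + 1]) &&
            isxdigit((unsigned char)src[i + 2])) {
            dst[n++] = (char)(demo_hex_val((unsigned char)src[i + 1]) * 16 +
                              demo_hex_val((unsigned char)src[i + 2]));
            i += 3;
        } else {
            dst[n++] = src[i++];
        }
    }
    return n;
}
```

The output is never longer than the input, so a destination of the same size as the source is always sufficient.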

void pj_scan_get_quote(pj_scanner *scanner, int begin_quote, int end_quote, pj_str_t *out)

Get characters between quotes. If the current input doesn’t match begin_quote, a syntax error will be thrown. Note that the resulting string will contain the enclosing quotes.

Parameters:
  • scanner – The scanner.

  • begin_quote – The character to begin the quote.

  • end_quote – The character to end the quote.

  • out – String to store the result.

void pj_scan_get_quotes(pj_scanner *scanner, const char *begin_quotes, const char *end_quotes, int qsize, pj_str_t *out)

Get characters between quotes. If the current input doesn’t match any character in begin_quotes, a syntax error will be thrown. Note that the resulting string will contain the enclosing quotes.

Parameters:
  • scanner – The scanner.

  • begin_quotes – The character array to begin the quotes. For example, the two characters “ and ‘.

  • end_quotes – The character array to end the quotes. The position of the match in the begin_quotes array selects the end quote to look for. So if begin_quotes is the array “’< then end_quotes should be “’>. If the input matched ‘ in begin_quotes, the scanner will look for the ‘ at the same position in end_quotes to close it.

  • qsize – The size of the begin_quotes and end_quotes arrays.

  • out – String to store the result.
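
The index pairing between begin_quotes and end_quotes can be sketched as follows. `demo_quoted_len` is a hypothetical stand-alone function: it returns 0 where the scanner would raise a syntax error, and it does not handle escaped quote characters inside the token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the length of the quoted token at the start of s (quotes included),
 * or 0 if s does not start with a begin quote or the end quote is missing. */
static size_t demo_quoted_len(const char *s, size_t len,
                              const char *begin_quotes,
                              const char *end_quotes, int qsize)
{
    int i;
    const char *close;
    assert(s != NULL && qsize > 0);
    if (len == 0)
        return 0;
    /* Find which begin quote opens the token. */
    for (i = 0; i < qsize; ++i)
        if (s[0] == begin_quotes[i])
            break;
    if (i == qsize)
        return 0;
    /* The end quote at the same index must close it. */
    close = memchr(s + 1, end_quotes[i], len - 1);
    return close ? (size_t)(close - s) + 1 : 0;
}
```

With begin quotes `"'<` and end quotes `"'>`, an input starting with `<` is closed only by `>`, while an input starting with `'` is closed by the next `'`.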

void pj_scan_get_n(pj_scanner *scanner, unsigned N, pj_str_t *out)

Get N characters from the scanner.

Parameters:
  • scanner – The scanner.

  • N – Number of characters to get.

  • out – String to store the result.

int pj_scan_get_char(pj_scanner *scanner)

Get one character from the scanner.

Parameters:

scanner – The scanner.

Returns:

The character.

void pj_scan_get_until(pj_scanner *scanner, const pj_cis_t *spec, pj_str_t *out)

Get characters from the scanner and move the scanner position until the current character matches the spec.

Parameters:
  • scanner – The scanner.

  • spec – Get until the input matches this spec.

  • out – String to store the result.

void pj_scan_get_until_ch(pj_scanner *scanner, int until_char, pj_str_t *out)

Get characters from the scanner and move the scanner position until the current character matches until_char.

Parameters:
  • scanner – The scanner.

  • until_char – Get until the input matches this character.

  • out – String to store the result.

void pj_scan_get_until_chr(pj_scanner *scanner, const char *until_spec, pj_str_t *out)

Get characters from the scanner and move the scanner position until the current character matches any of the characters in until_spec.

Parameters:
  • scanner – The scanner.

  • until_spec – Get until the input matches any of these characters.

  • out – String to store the result.
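
The "get until" behaviour leaves the position on the stop character itself rather than past it. The sketch below shows that with a hypothetical index-based helper; what the real scanner does when no stop character is found before the end of input is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Collect characters of s[0..len) starting at *pos, up to but not including
 * the first character that appears in stops. On return *pos indexes that
 * stop character (or equals len if none was found).
 * Returns the number of characters collected.
 * Note: strchr() also matches the terminator, so an embedded NUL stops too. */
static size_t demo_get_until(const char *s, size_t len, size_t *pos,
                             const char *stops)
{
    size_t start;
    assert(s != NULL && pos != NULL && stops != NULL && *pos <= len);
    start = *pos;
    while (*pos < len && strchr(stops, s[*pos]) == NULL)
        ++*pos;
    return *pos - start;
}
```

A caller that wants to continue past the delimiter has to consume it explicitly after the call, for example by advancing one character.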

void pj_scan_advance_n(pj_scanner *scanner, unsigned N, pj_bool_t skip)

Advance the scanner N characters, and skip whitespace if necessary.

Parameters:
  • scanner – The scanner.

  • N – Number of characters to skip.

  • skip – Flag to specify whether whitespace should be skipped after skipping the characters.

int pj_scan_strcmp(pj_scanner *scanner, const char *s, int len)

Compare string in current position with the specified string.

Parameters:
  • scanner – The scanner.

  • s – The string to compare with.

  • len – Length of the string to compare.

Returns:

zero, <0, or >0 (just like strcmp()).

int pj_scan_stricmp(pj_scanner *scanner, const char *s, int len)

Case-insensitive string comparison of current position with the specified string.

Parameters:
  • scanner – The scanner.

  • s – The string to compare with.

  • len – Length of the string to compare with.

Returns:

zero, <0, or >0 (just like strcmp()).

int pj_scan_stricmp_alnum(pj_scanner *scanner, const char *s, int len)

Perform case insensitive string comparison of string in current position, knowing that the string to compare only consists of alphanumeric characters.

Note that unlike pj_scan_stricmp, this function can only return zero or -1.

Parameters:
  • scanner – The scanner.

  • s – The string to compare with.

  • len – Length of the string to compare with.

Returns:

zero if equal or -1.

void pj_scan_get_newline(pj_scanner *scanner)

Get a newline from the scanner. A newline is defined as ‘\n’, or ‘\r’, or “\r\n”. If the current input is not a newline, a syntax error will be thrown.

Parameters:

scanner – The scanner.
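
The three newline forms can be recognised with a short check that treats “\r\n” as a single newline. `demo_newline_len` is a hypothetical helper that returns 0 where the scanner would raise a syntax error.

```c
#include <assert.h>
#include <stddef.h>

/* Return how many characters at the start of s[0..len) form one newline:
 * 2 for "\r\n", 1 for a lone '\r' or '\n', 0 if s does not start with one. */
static size_t demo_newline_len(const char *s, size_t len)
{
    assert(s != NULL);
    if (len == 0)
        return 0;
    if (s[0] == '\r')
        return (len > 1 && s[1] == '\n') ? 2 : 1;
    return s[0] == '\n' ? 1 : 0;
}
```

Note that “\n\r” counts as two separate newlines under this definition, since only the “\r\n” order is paired.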

void pj_scan_skip_whitespace(pj_scanner *scanner)

Manually skip whitespaces according to the flag that was specified when the scanner was initialized.

Parameters:

scanner – The scanner.

void pj_scan_skip_line(pj_scanner *scanner)

Skip current line.

Parameters:

scanner – The scanner.

void pj_scan_save_state(const pj_scanner *scanner, pj_scan_state *state)

Save the full scanner state.

Parameters:
  • scanner – The scanner.

  • state – Variable to store scanner’s state.

void pj_scan_restore_state(pj_scanner *scanner, pj_scan_state *state)

Restore the full scanner state. Note that this would not restore the string if application has modified it. This will only restore the scanner scanning position.

Parameters:
  • scanner – The scanner.

  • state – State of the scanner.
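
Saving and restoring state is typically used for backtracking: try one parse, and rewind if it fails. The sketch below shows the pattern with hypothetical `demo_` types whose fields mirror the position-related members listed for pj_scan_state; it is an illustration and does not call the real API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_scanner {
    const char *curptr;       /* current position */
    const char *end;          /* one past the last input character */
    int         line;         /* current line number */
    const char *start_line;   /* where the current line starts */
} demo_scanner;

typedef struct demo_state {
    const char *curptr;
    int         line;
    const char *start_line;
} demo_state;

static void demo_save(const demo_scanner *sc, demo_state *st)
{
    st->curptr = sc->curptr;
    st->line = sc->line;
    st->start_line = sc->start_line;
}

/* Restores the position only; the input text itself is not touched. */
static void demo_restore(demo_scanner *sc, const demo_state *st)
{
    sc->curptr = st->curptr;
    sc->line = st->line;
    sc->start_line = st->start_line;
}

/* Try to consume the keyword kw; on failure put the scanner back
 * where it was. Returns 1 on success, 0 otherwise. */
static int demo_try_keyword(demo_scanner *sc, const char *kw)
{
    demo_state saved;
    assert(sc != NULL && kw != NULL);
    demo_save(sc, &saved);
    while (*kw != '\0') {
        if (sc->curptr == sc->end || *sc->curptr != *kw) {
            demo_restore(sc, &saved);
            return 0;
        }
        ++sc->curptr;
        ++kw;
    }
    return 1;
}
```

A failed attempt leaves the scanner exactly where it started, so several alternatives can be tried in sequence from the same position.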

int pj_scan_get_col(const pj_scanner *scanner)

Get current column position.

Parameters:

scanner – The scanner.

Returns:

The column position.

struct pj_scanner
#include <scanner.h>

The text scanner structure.

Public Members

char *begin

Start of input buffer.

char *end

End of input buffer.

char *curptr

Current pointer.

int line

Current line.

char *start_line

Where current line starts.

int skip_ws

Skip whitespace flag.

pj_syn_err_func_ptr callback

Syntax error callback.

struct pj_scan_state
#include <scanner.h>

This structure can be used by the application to store the state of the scanner, so that the scanner can be rolled back to this state when necessary.

Public Members

char *curptr

Current scanner’s pointer.

int line

Current line.

char *start_line

Start of current line.