Class CsvParserSettings
- java.lang.Object
-
- com.univocity.parsers.common.CommonSettings<F>
-
- com.univocity.parsers.common.CommonParserSettings<CsvFormat>
-
- com.univocity.parsers.csv.CsvParserSettings
-
- All Implemented Interfaces:
java.lang.Cloneable
public class CsvParserSettings extends CommonParserSettings<CsvFormat>
This is the configuration class used by the CSV parser (CsvParser).
In addition to the configuration options provided by CommonParserSettings, CsvParserSettings includes:
- emptyValue (defaults to null): Defines a replacement string to signify an empty value (which is not a null value).
When reading, if the parser does not read any character from the input, and the input is within quotes, the emptyValue is used instead of an empty string.
- Author:
- uniVocity Software Pty Ltd - parsers@univocity.com
- See Also:
CsvParser, CsvFormat, CommonParserSettings
-
-
Field Summary
-
Fields inherited from class com.univocity.parsers.common.CommonParserSettings
headerExtractionEnabled
-
-
Constructor Summary
Constructor
CsvParserSettings()
-
Method Summary
Modifier and Type Method Description
protected void
addConfiguration(java.util.Map<java.lang.String,java.lang.Object> out)
CsvParserSettings
clone()
Clones this configuration object.
CsvParserSettings
clone(boolean clearInputSpecificSettings)
Clones this configuration object to reuse user-provided settings.
protected CsvFormat
createDefaultFormat()
Returns the default CsvFormat configured to handle CSV inputs compliant with the RFC4180 standard.
void
detectFormatAutomatically()
Convenience method to turn on all format detection features in a single method call, namely: setDelimiterDetectionEnabled(boolean), setQuoteDetectionEnabled(boolean), and CommonParserSettings.setLineSeparatorDetectionEnabled(boolean).
java.lang.String
getEmptyValue()
Returns the String representation of an empty value (defaults to null).
boolean
getKeepQuotes()
Flag indicating whether the parser should keep enclosing quote characters in the values parsed from the input.
UnescapedQuoteHandling
getUnescapedQuoteHandling()
Returns the method of handling values with unescaped quotes.
boolean
isDelimiterDetectionEnabled()
Returns a flag indicating whether the parser should analyze the input to discover the column delimiter character.
boolean
isEscapeUnquotedValues()
Indicates whether escape sequences should be processed in unquoted values.
boolean
isKeepEscapeSequences()
Indicates whether the parser should keep any escape sequences if they are present in the input.
boolean
isNormalizeLineEndingsWithinQuotes()
Flag indicating whether the parser should replace line separators, specified in Format.getLineSeparator(), by the normalized line separator character specified in Format.getNormalizedNewline(), even on quoted values.
boolean
isParseUnescapedQuotes()
Deprecated. Use getUnescapedQuoteHandling() instead.
boolean
isParseUnescapedQuotesUntilDelimiter()
Deprecated. Use getUnescapedQuoteHandling() instead.
boolean
isQuoteDetectionEnabled()
Returns a flag indicating whether the parser should analyze the input to discover the quote character.
protected CharAppender
newCharAppender()
Returns an instance of CharAppender with the configured limit of maximum characters per column and the default value used to represent an empty value (when the String parsed from the input, within quotes, is empty).
void
setDelimiterDetectionEnabled(boolean separatorDetectionEnabled)
Configures the parser to analyze the input before parsing to discover the column delimiter character.
void
setEmptyValue(java.lang.String emptyValue)
Sets the String representation of an empty value (defaults to null).
void
setEscapeUnquotedValues(boolean escapeUnquotedValues)
Configures the parser to process escape sequences in unquoted values.
void
setKeepEscapeSequences(boolean keepEscapeSequences)
Configures the parser to keep any escape sequences if they are present in the input.
void
setKeepQuotes(boolean keepQuotes)
Configures the parser to keep enclosing quote characters in the values parsed from the input.
void
setNormalizeLineEndingsWithinQuotes(boolean normalizeLineEndingsWithinQuotes)
Configures the parser to replace line separators, specified in Format.getLineSeparator(), by the normalized line separator character specified in Format.getNormalizedNewline(), even on quoted values.
void
setParseUnescapedQuotes(boolean parseUnescapedQuotes)
Deprecated. Use setUnescapedQuoteHandling(UnescapedQuoteHandling) instead.
void
setParseUnescapedQuotesUntilDelimiter(boolean parseUnescapedQuotesUntilDelimiter)
Deprecated. Use setUnescapedQuoteHandling(UnescapedQuoteHandling) instead.
void
setQuoteDetectionEnabled(boolean quoteDetectionEnabled)
Configures the parser to analyze the input before parsing to discover the quote character.
void
setUnescapedQuoteHandling(UnescapedQuoteHandling unescapedQuoteHandling)
Configures the handling of values with unescaped quotes.
-
Methods inherited from class com.univocity.parsers.common.CommonParserSettings
clearInputSpecificSettings, configureFromAnnotations, getInputBufferSize, getNumberOfRecordsToRead, getNumberOfRowsToSkip, getProcessor, getReadInputOnSeparateThread, getRowProcessor, isColumnReorderingEnabled, isCommentCollectionEnabled, isHeaderExtractionEnabled, isLineSeparatorDetectionEnabled, newCharInputReader, setColumnReorderingEnabled, setCommentCollectionEnabled, setHeaderExtractionEnabled, setInputBufferSize, setLineSeparatorDetectionEnabled, setNumberOfRecordsToRead, setNumberOfRowsToSkip, setProcessor, setReadInputOnSeparateThread, setRowProcessor
-
Methods inherited from class com.univocity.parsers.common.CommonSettings
excludeFields, excludeFields, excludeIndexes, getErrorContentLength, getFormat, getHeaders, getIgnoreLeadingWhitespaces, getIgnoreTrailingWhitespaces, getMaxCharsPerColumn, getMaxColumns, getNullValue, getProcessorErrorHandler, getRowProcessorErrorHandler, getSkipBitsAsWhitespace, getSkipEmptyLines, getWhitespaceRangeStart, isAutoConfigurationEnabled, isProcessorErrorHandlerDefined, selectFields, selectFields, selectIndexes, setAutoConfigurationEnabled, setErrorContentLength, setFormat, setHeaders, setIgnoreLeadingWhitespaces, setIgnoreTrailingWhitespaces, setMaxCharsPerColumn, setMaxColumns, setNullValue, setProcessorErrorHandler, setRowProcessorErrorHandler, setSkipBitsAsWhitespace, setSkipEmptyLines, toString, trimValues
-
-
-
-
Method Detail
-
getEmptyValue
public java.lang.String getEmptyValue()
Returns the String representation of an empty value (defaults to null). When reading, if the parser does not read any character from the input, and the input is within quotes, the emptyValue is used instead of an empty string.
- Returns:
- the String representation of an empty value
-
setEmptyValue
public void setEmptyValue(java.lang.String emptyValue)
Sets the String representation of an empty value (defaults to null). When reading, if the parser does not read any character from the input, and the input is within quotes, the emptyValue is used instead of an empty string.
- Parameters:
emptyValue
- the String representation of an empty value
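The difference between an empty value and a null value can be illustrated with a small plain-Java sketch. It does not use or reproduce the parser's code: the class EmptyValueSketch and the method resolve are hypothetical names, and the handling of unquoted fields with no content through a null-value replacement is an assumption based on the inherited setNullValue setting.

```java
public class EmptyValueSketch {

    /**
     * Hypothetical helper: picks the value reported for one field.
     *
     * @param content    characters read for the field, without enclosing quotes
     * @param quoted     whether the field was enclosed in quotes in the input
     * @param nullValue  replacement for a field with no content and no quotes (assumption)
     * @param emptyValue replacement for a field with no content inside quotes
     */
    public static String resolve(String content, boolean quoted, String nullValue, String emptyValue) {
        if (!content.isEmpty()) {
            return content; // something was read, so no replacement applies
        }
        return quoted ? emptyValue : nullValue;
    }

    public static void main(String[] args) {
        // Fields of the input line: a,,""
        System.out.println(resolve("a", false, "<null>", "<empty>")); // a
        System.out.println(resolve("", false, "<null>", "<empty>"));  // <null>
        System.out.println(resolve("", true, "<null>", "<empty>"));   // <empty>
    }
}
```

With the default emptyValue of null, the quoted empty field in this sketch would be reported as null rather than as an empty string.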
-
newCharAppender
protected CharAppender newCharAppender()
Returns an instance of CharAppender with the configured limit of maximum characters per column and the default value used to represent an empty value (when the String parsed from the input, within quotes, is empty). This overrides the parent's version because the CSV parser does not rely on the appender to identify null values, but on the other hand, the appender is required to identify empty values.
- Overrides:
newCharAppender
in class CommonParserSettings<CsvFormat>
- Returns:
- an instance of CharAppender with the configured limit of maximum characters per column and the default value used to represent an empty value (when the String parsed from the input, within quotes, is empty)
-
createDefaultFormat
protected CsvFormat createDefaultFormat()
Returns the default CsvFormat configured to handle CSV inputs compliant with the RFC4180 standard.
- Specified by:
createDefaultFormat
in class CommonSettings<CsvFormat>
- Returns:
- an instance of CsvFormat configured to handle CSV inputs compliant with the RFC4180 standard.
-
isParseUnescapedQuotes
@Deprecated public boolean isParseUnescapedQuotes()
Deprecated. Use getUnescapedQuoteHandling() instead. The configuration returned by getUnescapedQuoteHandling() will override this setting if not null.
Indicates whether the CSV parser should accept unescaped quotes inside quoted values and parse them normally. Defaults to true.
- Returns:
- a flag indicating whether or not the CSV parser should accept unescaped quotes inside quoted values.
-
setParseUnescapedQuotes
@Deprecated public void setParseUnescapedQuotes(boolean parseUnescapedQuotes)
Deprecated. Use setUnescapedQuoteHandling(UnescapedQuoteHandling) instead. The configuration returned by getUnescapedQuoteHandling() will override this setting if not null.
Configures how to handle unescaped quotes inside quoted values. If set to true, the parser will parse the quote normally as part of the value. If set to false, a TextParsingException will be thrown. Defaults to true.
- Parameters:
parseUnescapedQuotes
- indicates whether or not the CSV parser should accept unescaped quotes inside quoted values.
-
setParseUnescapedQuotesUntilDelimiter
@Deprecated public void setParseUnescapedQuotesUntilDelimiter(boolean parseUnescapedQuotesUntilDelimiter)
Deprecated. Use setUnescapedQuoteHandling(UnescapedQuoteHandling) instead. The configuration returned by getUnescapedQuoteHandling() will override this setting if not null.
Configures the parser to process values with unescaped quotes, and to stop accumulating characters and consider the value parsed when a delimiter is found. Defaults to true.
- Parameters:
parseUnescapedQuotesUntilDelimiter
- a flag indicating that the parser should stop accumulating values when a field delimiter character is found when parsing unquoted and unescaped values.
-
isParseUnescapedQuotesUntilDelimiter
@Deprecated public boolean isParseUnescapedQuotesUntilDelimiter()
Deprecated. Use getUnescapedQuoteHandling() instead. The configuration returned by getUnescapedQuoteHandling() will override this setting if not null.
When parsing unescaped quotes, indicates that the parser should stop accumulating characters and consider the value parsed when a delimiter is found. Defaults to true.
- Returns:
- a flag indicating that the parser should stop accumulating values when a field delimiter character is found when parsing unquoted and unescaped values.
-
isEscapeUnquotedValues
public boolean isEscapeUnquotedValues()
Indicates whether escape sequences should be processed in unquoted values. Defaults to false.
By default, this is disabled and if the input is A""B,C, the resulting values will be [A""B] and [C] (i.e. the content is read as-is). However, if the parser is configured to process escape sequences in unquoted values, the result will be [A"B] and [C].
- Returns:
- true if escape sequences should be processed in unquoted values, otherwise false
-
setEscapeUnquotedValues
public void setEscapeUnquotedValues(boolean escapeUnquotedValues)
Configures the parser to process escape sequences in unquoted values. Defaults to false.
By default, this is disabled and if the input is A""B,C, the resulting values will be [A""B] and [C] (i.e. the content is read as-is). However, if the parser is configured to process escape sequences in unquoted values, the result will be [A"B] and [C].
- Parameters:
escapeUnquotedValues
- a flag indicating whether escape sequences should be processed in unquoted values
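The A""B,C example above can be reproduced with a short plain-Java sketch. It is an illustration under stated assumptions, not the parser's implementation: UnquotedEscapeSketch and readUnquoted are hypothetical names, the input is assumed to be already split on the delimiter, and both the quote and the quote escape are assumed to be the double quote character.

```java
public class UnquotedEscapeSketch {

    /**
     * Hypothetical helper: returns the value reported for an unquoted field.
     * Assumes the quote and the quote escape characters are both '"'.
     */
    public static String readUnquoted(String field, boolean escapeUnquotedValues) {
        if (!escapeUnquotedValues) {
            return field; // default: the content is read as-is
        }
        // Each escaped quote ("") collapses to a single quote (").
        return field.replace("\"\"", "\"");
    }

    public static void main(String[] args) {
        // The input A""B,C, already split on the delimiter.
        String[] fields = {"A\"\"B", "C"};
        for (String field : fields) {
            System.out.println("disabled: [" + readUnquoted(field, false) + "]"
                    + " enabled: [" + readUnquoted(field, true) + "]");
        }
    }
}
```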
-
isKeepEscapeSequences
public final boolean isKeepEscapeSequences()
Indicates whether the parser should keep any escape sequences if they are present in the input (i.e. a quote escape sequence such as two double quotes "" won't be replaced by a single double quote ").
This is disabled by default.
- Returns:
- a flag indicating whether escape sequences should be kept (and not replaced) by the parser.
-
setKeepEscapeSequences
public final void setKeepEscapeSequences(boolean keepEscapeSequences)
Configures the parser to keep any escape sequences if they are present in the input (i.e. a quote escape sequence such as two double quotes "" won't be replaced by a single double quote ").
This is disabled by default.
- Parameters:
keepEscapeSequences
- the flag indicating whether escape sequences should be kept (and not replaced) by the parser.
-
isDelimiterDetectionEnabled
public final boolean isDelimiterDetectionEnabled()
Returns a flag indicating whether the parser should analyze the input to discover the column delimiter character.
Note that the detection process is not guaranteed to discover the correct column delimiter. If detection fails, the delimiter provided by CsvFormat.getDelimiter() will be used.
- Returns:
- a flag indicating whether the parser should analyze the input to discover the column delimiter character.
-
setDelimiterDetectionEnabled
public final void setDelimiterDetectionEnabled(boolean separatorDetectionEnabled)
Configures the parser to analyze the input before parsing to discover the column delimiter character.
Note that the detection process is not guaranteed to discover the correct column delimiter. If detection fails, the delimiter provided by CsvFormat.getDelimiter() will be used.
- Parameters:
separatorDetectionEnabled
- the flag to enable/disable discovery of the column delimiter character.
-
isQuoteDetectionEnabled
public final boolean isQuoteDetectionEnabled()
Returns a flag indicating whether the parser should analyze the input to discover the quote character. The quote escape will also be detected as part of this process.
Note that the detection process is not guaranteed to discover the correct quote and quote escape characters. If detection fails, the characters provided by CsvFormat.getQuote() and CsvFormat.getQuoteEscape() will be used.
- Returns:
- a flag indicating whether the parser should analyze the input to discover the quote character. The quote escape will also be detected as part of this process.
-
setQuoteDetectionEnabled
public final void setQuoteDetectionEnabled(boolean quoteDetectionEnabled)
Configures the parser to analyze the input before parsing to discover the quote character. The quote escape will also be detected as part of this process.
Note that the detection process is not guaranteed to discover the correct quote and quote escape characters. If detection fails, the characters provided by CsvFormat.getQuote() and CsvFormat.getQuoteEscape() will be used.
- Parameters:
quoteDetectionEnabled
- the flag to enable/disable discovery of the quote character. The quote escape will also be detected as part of this process.
-
detectFormatAutomatically
public final void detectFormatAutomatically()
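Convenience method to turn on all format detection features in a single method call, namely: setDelimiterDetectionEnabled(boolean), setQuoteDetectionEnabled(boolean), and CommonParserSettings.setLineSeparatorDetectionEnabled(boolean).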
-
isNormalizeLineEndingsWithinQuotes
public boolean isNormalizeLineEndingsWithinQuotes()
Flag indicating whether the parser should replace line separators, specified in Format.getLineSeparator(), by the normalized line separator character specified in Format.getNormalizedNewline(), even on quoted values. This is enabled by default and is used to ensure data can be read on any platform without introducing unwanted blank lines.
For example, consider the quoted value "Line1 \r\n Line2". If this is parsed using "\r\n" as the line separator sequence, and the normalized new line is set to '\n' (the default), the output will be: [Line1 \n Line2]
However, if the value is meant to be kept untouched, and the original line separator should be maintained, set normalizeLineEndingsWithinQuotes to false. This will make the parser read the value as-is, producing: [Line1 \r\n Line2]
- Returns:
true if line separators in quoted values will be normalized, false otherwise
-
setNormalizeLineEndingsWithinQuotes
public void setNormalizeLineEndingsWithinQuotes(boolean normalizeLineEndingsWithinQuotes)
Configures the parser to replace line separators, specified in Format.getLineSeparator(), by the normalized line separator character specified in Format.getNormalizedNewline(), even on quoted values. This is enabled by default and is used to ensure data can be read on any platform without introducing unwanted blank lines.
For example, consider the quoted value "Line1 \r\n Line2". If this is parsed using "\r\n" as the line separator sequence, and the normalized new line is set to '\n' (the default), the output will be: [Line1 \n Line2]
However, if the value is meant to be kept untouched, and the original line separator should be maintained, set normalizeLineEndingsWithinQuotes to false. This will make the parser read the value as-is, producing: [Line1 \r\n Line2]
- Parameters:
normalizeLineEndingsWithinQuotes
- flag indicating whether line separators in quoted values should be replaced by the character specified in Format.getNormalizedNewline().
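The effect of this flag on the "Line1 \r\n Line2" example can be sketched in plain Java. This is only an illustration of the replacement described above, not the parser's code: LineEndingSketch and readQuoted are hypothetical names, and the line separator and normalized newline are passed in directly instead of being read from a format object.

```java
public class LineEndingSketch {

    /**
     * Hypothetical helper: returns the content reported for a quoted value.
     *
     * @param content           the characters between the enclosing quotes
     * @param lineSeparator     the line separator sequence of the input, e.g. "\r\n"
     * @param normalizedNewline the character that replaces the separator, e.g. '\n'
     * @param normalize         stands in for normalizeLineEndingsWithinQuotes
     */
    public static String readQuoted(String content, String lineSeparator,
                                    char normalizedNewline, boolean normalize) {
        if (!normalize) {
            return content; // the value is kept untouched
        }
        return content.replace(lineSeparator, String.valueOf(normalizedNewline));
    }

    public static void main(String[] args) {
        String quoted = "Line1 \r\n Line2";
        String normalized = readQuoted(quoted, "\r\n", '\n', true);
        String untouched = readQuoted(quoted, "\r\n", '\n', false);
        System.out.println(normalized.equals("Line1 \n Line2")); // true
        System.out.println(untouched.equals(quoted));            // true
    }
}
```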
-
setUnescapedQuoteHandling
public void setUnescapedQuoteHandling(UnescapedQuoteHandling unescapedQuoteHandling)
Configures the handling of values with unescaped quotes. Defaults to null, for backward compatibility with isParseUnescapedQuotes() and isParseUnescapedQuotesUntilDelimiter(). If set to a non-null value, this setting will override the configuration of isParseUnescapedQuotes() and isParseUnescapedQuotesUntilDelimiter().
- Parameters:
unescapedQuoteHandling
- the handling method to be used when unescaped quotes are found in the input.
-
getUnescapedQuoteHandling
public UnescapedQuoteHandling getUnescapedQuoteHandling()
Returns the method of handling values with unescaped quotes. Defaults to null, for backward compatibility with isParseUnescapedQuotes() and isParseUnescapedQuotesUntilDelimiter(). If set to a non-null value, this setting will override the configuration of isParseUnescapedQuotes() and isParseUnescapedQuotesUntilDelimiter().
- Returns:
- the handling method to be used when unescaped quotes are found in the input, or null if not set.
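The precedence between this setting and the two deprecated flags can be sketched as follows. This is a hedged illustration in plain Java, not the library's code: QuoteHandlingSketch, effective and the Handling enum are hypothetical, the constant names are assumptions standing in for UnescapedQuoteHandling values, and the mapping of the deprecated flags onto those constants is inferred from the descriptions of setParseUnescapedQuotes and setParseUnescapedQuotesUntilDelimiter.

```java
public class QuoteHandlingSketch {

    /** Hypothetical stand-in for UnescapedQuoteHandling; constant names are assumptions. */
    public enum Handling { STOP_AT_CLOSING_QUOTE, STOP_AT_DELIMITER, RAISE_ERROR }

    /**
     * Hypothetical helper: resolves the behavior that applies to unescaped quotes.
     *
     * @param configured                         the explicit handling, or null if not set
     * @param parseUnescapedQuotes               deprecated flag (defaults to true)
     * @param parseUnescapedQuotesUntilDelimiter deprecated flag (defaults to true)
     */
    public static Handling effective(Handling configured,
                                     boolean parseUnescapedQuotes,
                                     boolean parseUnescapedQuotesUntilDelimiter) {
        if (configured != null) {
            return configured; // a non-null setting overrides both deprecated flags
        }
        if (!parseUnescapedQuotes) {
            return Handling.RAISE_ERROR; // corresponds to throwing a TextParsingException
        }
        // Assumed mapping of the second deprecated flag.
        return parseUnescapedQuotesUntilDelimiter
                ? Handling.STOP_AT_DELIMITER
                : Handling.STOP_AT_CLOSING_QUOTE;
    }

    public static void main(String[] args) {
        System.out.println(effective(null, true, true));                   // STOP_AT_DELIMITER
        System.out.println(effective(null, false, true));                  // RAISE_ERROR
        System.out.println(effective(Handling.RAISE_ERROR, true, true));   // RAISE_ERROR
    }
}
```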
-
getKeepQuotes
public boolean getKeepQuotes()
Flag indicating whether the parser should keep enclosing quote characters in the values parsed from the input. Defaults to false.
- Returns:
- a flag indicating whether enclosing quotes should be maintained when parsing quoted values.
-
setKeepQuotes
public void setKeepQuotes(boolean keepQuotes)
Configures the parser to keep enclosing quote characters in the values parsed from the input. Defaults to false.
- Parameters:
keepQuotes
- flag indicating whether enclosing quotes should be maintained when parsing quoted values.
-
addConfiguration
protected void addConfiguration(java.util.Map<java.lang.String,java.lang.Object> out)
- Overrides:
addConfiguration
in class CommonParserSettings<CsvFormat>
-
clone
public final CsvParserSettings clone()
Description copied from class: CommonSettings
Clones this configuration object. Use the alternative CommonSettings.clone(boolean) method to reset properties that are specific to a given input, such as header names and selection of fields.
- Overrides:
clone
in class CommonParserSettings<CsvFormat>
- Returns:
- a copy of all configurations applied to the current instance.
-
clone
public final CsvParserSettings clone(boolean clearInputSpecificSettings)
Description copied from class: CommonSettings
Clones this configuration object to reuse user-provided settings. Properties that are specific to a given input (such as header names and selection of fields) can be reset to their defaults if the clearInputSpecificSettings flag is set to true.
- Overrides:
clone
in class CommonParserSettings<CsvFormat>
- Parameters:
clearInputSpecificSettings
- flag indicating whether to clear settings that are likely to be associated with a given input.
- Returns:
- a copy of the configurations applied to the current instance.
-
-