Closed
38 commits
5a5b347
add original changes.
lucaionescu Dec 18, 2019
18bd98f
ENH: Add support for DataFrame(Categorical) (#11363) (#30305)
proost Dec 18, 2019
416907d
DOC: whatsnew fixups (#30331)
TomAugspurger Dec 18, 2019
f36eac1
CLN: changed .format to f-string in pandas/core/dtypes (#30287)
DorAmram Dec 18, 2019
70a083f
Fix typos, via a Levenshtein-style corrector (#30341)
bwignall Dec 19, 2019
20e4c18
TYPING: Enable --check-untyped-defs for MyPy (#29493)
simonjayhawkins Dec 19, 2019
53a0dfd
BUG: Fix infer_dtype_from_scalar to infer IntervalDtype (#30339)
jschendel Dec 19, 2019
5b25df2
API: Return BoolArray for string ops when backed by StringArray (#30239)
TomAugspurger Dec 19, 2019
f8b9ce7
REF: change parameter name fname -> path (#30338)
jbrockmendel Dec 19, 2019
8cbfd06
CLN: make lookups explicit instead of using globals (#30343)
jbrockmendel Dec 19, 2019
2bfd10c
REF: remove pytables Table.metadata (#30342)
jbrockmendel Dec 19, 2019
95e1a63
REF: pytables prepare to make _create_axes return a new object (#30344)
jbrockmendel Dec 19, 2019
e66a2c7
CLN: format replaced with f-strings #29547 (#30355)
hasnain2808 Dec 19, 2019
011a667
replace str.format with f-string (#30363)
AlpAribal Dec 20, 2019
c521a4e
DOC: "Next" link from user_guide/io.rst goes to read_sql_table API pa…
souvik3333 Dec 20, 2019
b4343ef
CI: troubleshoot codecov (#30070)
jbrockmendel Dec 20, 2019
66038e9
BUG+TST: non-optimized apply_index and empty DatetimeIndex (#30336)
jbrockmendel Dec 20, 2019
a9e2566
REF: define NA_VALUES in libparsers (#30373)
jbrockmendel Dec 20, 2019
eadaa40
[CLN] remove now-unnecessary td.skip_if_no(pathlib) (#30376)
MarcoGorelli Dec 20, 2019
1be80ea
REF: directory for method-specific series/frame tests (#30362)
jbrockmendel Dec 20, 2019
a6b047a
REF: refactor cumulative op tests from test_analytics (#30358)
jbrockmendel Dec 20, 2019
9296849
Cleaned up Tempita refs and Cython import (#30330)
WillAyd Dec 20, 2019
6efc237
CLN: Old string formatting: .format() -> f"" (#30328)
baevpetr Dec 20, 2019
0df8858
de-privatize io.common functions (#30368)
jbrockmendel Dec 20, 2019
0cd388f
CLN: remove py2-legacy UnicodeReader, UnicodeWriter (#30371)
jbrockmendel Dec 20, 2019
8376067
CI: troubleshoot codecov (#30380)
jbrockmendel Dec 21, 2019
c869255
CLN: move code out of try clause in merge.py (#30382)
topper-123 Dec 21, 2019
477b2d5
TYP: Annotations in core/indexes/ (#30390)
ShaharNaveh Dec 21, 2019
835f207
DOC: fix external links + favicon (#30389)
jorisvandenbossche Dec 22, 2019
a2bbdb5
STY: Underscores for long numbers (#30397)
ShaharNaveh Dec 22, 2019
104fc11
fix call of tm.assert_frame_equal
lucaionescu Dec 22, 2019
97b182b
add original changes.
lucaionescu Dec 18, 2019
3c8f95b
fix call of tm.assert_frame_equal
lucaionescu Dec 22, 2019
df2671b
merge.
lucaionescu Dec 22, 2019
f46426e
Revert "fix call of tm.assert_frame_equal"
lucaionescu Dec 22, 2019
0be5dd7
Revert "merge."
lucaionescu Dec 22, 2019
16217f0
fix tm.assert_frame_equal. use naming conventions.
lucaionescu Dec 22, 2019
640f729
sort imports correctly.
lucaionescu Dec 22, 2019
Revert "merge."
This reverts commit df2671b, reversing
changes made to 104fc11.
lucaionescu committed Dec 22, 2019
commit 0be5dd79c653d77a38f9f52bfec0bb7aca132d85
4 changes: 2 additions & 2 deletions ci/run_tests.sh
Original file line number Diff line number Diff line change
@@ -38,6 +38,6 @@ sh -c "$PYTEST_CMD"

if [[ "$COVERAGE" && $? == 0 && "$TRAVIS_BRANCH" == "master" ]]; then
echo "uploading coverage"
echo "bash <(curl -s https://codecov.io/bash) -Z -c -f $COVERAGE_FNAME"
bash <(curl -s https://codecov.io/bash) -Z -c -f $COVERAGE_FNAME
echo "bash <(curl -s https://codecov.io/bash) -Z -c -F $TYPE -f $COVERAGE_FNAME"
bash <(curl -s https://codecov.io/bash) -Z -c -F $TYPE -f $COVERAGE_FNAME
fi
Binary file added doc/source/_static/favicon.ico
8 changes: 2 additions & 6 deletions doc/source/conf.py
@@ -204,11 +204,7 @@
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
html_theme_options = {
"external_links": [],
"github_url": "https://github.com/pandas-dev/pandas",
"twitter_url": "https://twitter.com/pandas_dev",
}
# html_theme_options = {}

# Add any paths that contain custom themes here, relative to this directory.
# html_theme_path = ["themes"]
@@ -232,7 +228,7 @@
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
html_favicon = "../../web/pandas/static/img/favicon.ico"
html_favicon = os.path.join(html_static_path[0], "favicon.ico")

# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
5 changes: 3 additions & 2 deletions doc/source/user_guide/io.rst
@@ -4763,10 +4763,10 @@ Parquet supports partitioning of data based on the values of one or more columns
.. ipython:: python

df = pd.DataFrame({'a': [0, 0, 1, 1], 'b': [0, 1, 0, 1]})
df.to_parquet(path='test', engine='pyarrow',
df.to_parquet(fname='test', engine='pyarrow',
partition_cols=['a'], compression=None)

The `path` specifies the parent directory to which data will be saved.
The `fname` specifies the parent directory to which data will be saved.
The `partition_cols` are the column names by which the dataset will be partitioned.
Columns are partitioned in the order they are given. The partition splits are
determined by the unique values in the partition columns.
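The partition rule described here can be sketched with plain pandas (illustrative only; the actual on-disk partitioning is performed by the Parquet engine):

```python
import pandas as pd

# One group of rows per unique value of the partition column 'a',
# mirroring the DataFrame used in the hunk above.
df = pd.DataFrame({'a': [0, 0, 1, 1], 'b': [0, 1, 0, 1]})
partitions = {key: group.drop(columns='a') for key, group in df.groupby('a')}
print(sorted(partitions))  # [0, 1]
```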
@@ -4828,6 +4828,7 @@ See also some :ref:`cookbook examples <cookbook.sql>` for some advanced strategi
The key functions are:

.. autosummary::
:toctree: ../reference/api/

read_sql_table
read_sql_query
9 changes: 1 addition & 8 deletions doc/source/user_guide/text.rst
@@ -74,7 +74,6 @@ These are places where the behavior of ``StringDtype`` objects differ from
1. For ``StringDtype``, :ref:`string accessor methods<api.series.str>`
that return **numeric** output will always return a nullable integer dtype,
rather than either int or float dtype, depending on the presence of NA values.
Methods returning **boolean** output will return a nullable boolean dtype.

.. ipython:: python

@@ -90,13 +89,7 @@ 1. For ``StringDtype``, :ref:`string accessor methods<api.series.str>`
s.astype(object).str.count("a")
s.astype(object).dropna().str.count("a")

When NA values are present, the output dtype is float64. Similarly for
methods returning boolean values.

.. ipython:: python

s.str.isdigit()
s.str.match("a")
When NA values are present, the output dtype is float64.
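The object-dtype behavior described above, in a minimal illustration (example values chosen here):

```python
import pandas as pd

# With object dtype, a missing value forces the numeric result of a
# string accessor method to float64.
s = pd.Series(["a", None, "b"], dtype=object)
out = s.str.count("a")
print(out.dtype)  # float64
```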

2. Some string methods, like :meth:`Series.str.decode` are not available
on ``StringArray`` because ``StringArray`` only holds strings, not
16 changes: 7 additions & 9 deletions doc/source/whatsnew/v1.0.0.rst
@@ -206,7 +206,6 @@ Other enhancements
now preserve those data types with pyarrow >= 1.0.0 (:issue:`20612`).
- The ``partition_cols`` argument in :meth:`DataFrame.to_parquet` now accepts a string (:issue:`27117`)
- :func:`to_parquet` now appropriately handles the ``schema`` argument for user defined schemas in the pyarrow engine. (:issue: `30270`)
- DataFrame constructor preserve `ExtensionArray` dtype with `ExtensionArray` (:issue:`11363`)


Build Changes
@@ -255,10 +254,10 @@ To update, use ``MultiIndex.set_names``, which returns a new ``MultiIndex``.
mi2 = mi.set_names("new name", level=0)
mi2.names

New repr for :class:`~pandas.arrays.IntervalArray`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
New repr for :class:`pandas.core.arrays.IntervalArray`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- :class:`pandas.arrays.IntervalArray` adopts a new ``__repr__`` in accordance with other array classes (:issue:`25022`)
- :class:`pandas.core.arrays.IntervalArray` adopts a new ``__repr__`` in accordance with other array classes (:issue:`25022`)

*pandas 0.25.x*

@@ -502,7 +501,6 @@ Deprecations
- :func:`pandas.json_normalize` is now exposed in the top-level namespace.
Usage of ``json_normalize`` as ``pandas.io.json.json_normalize`` is now deprecated and
it is recommended to use ``json_normalize`` as :func:`pandas.json_normalize` instead (:issue:`27586`).
- :meth:`DataFrame.to_stata`, :meth:`DataFrame.to_feather`, and :meth:`DataFrame.to_parquet` argument "fname" is deprecated, use "path" instead (:issue:`23574`)
- The ``numpy`` argument of :meth:`pandas.read_json` is deprecated (:issue:`28512`).
-

@@ -580,7 +578,7 @@ or ``matplotlib.Axes.plot``. See :ref:`plotting.formatters` for more.
- :meth:`Series.where` with ``Categorical`` dtype (or :meth:`DataFrame.where` with ``Categorical`` column) no longer allows setting new categories (:issue:`24114`)
- :class:`DatetimeIndex`, :class:`TimedeltaIndex`, and :class:`PeriodIndex` constructors no longer allow ``start``, ``end``, and ``periods`` keywords, use :func:`date_range`, :func:`timedelta_range`, and :func:`period_range` instead (:issue:`23919`)
- :class:`DatetimeIndex` and :class:`TimedeltaIndex` constructors no longer have a ``verify_integrity`` keyword argument (:issue:`23919`)
- ``pandas.core.internals.blocks.make_block`` no longer accepts the "fastpath" keyword(:issue:`19265`)
- :func:`core.internals.blocks.make_block` no longer accepts the "fastpath" keyword(:issue:`19265`)
- :meth:`Block.make_block_same_class` no longer accepts the "dtype" keyword(:issue:`19434`)
- Removed the previously deprecated :meth:`ExtensionArray._formatting_values`. Use :attr:`ExtensionArray._formatter` instead. (:issue:`23601`)
- Removed the previously deprecated :meth:`MultiIndex.to_hierarchical` (:issue:`21613`)
@@ -657,7 +655,7 @@ Performance improvements
~~~~~~~~~~~~~~~~~~~~~~~~

- Performance improvement in indexing with a non-unique :class:`IntervalIndex` (:issue:`27489`)
- Performance improvement in :attr:`MultiIndex.is_monotonic` (:issue:`27495`)
- Performance improvement in `MultiIndex.is_monotonic` (:issue:`27495`)
- Performance improvement in :func:`cut` when ``bins`` is an :class:`IntervalIndex` (:issue:`27668`)
- Performance improvement when initializing a :class:`DataFrame` using a ``range`` (:issue:`30171`)
- Performance improvement in :meth:`DataFrame.corr` when ``method`` is ``"spearman"`` (:issue:`28139`)
@@ -713,7 +711,7 @@ Datetimelike
- Bug in :func:`pandas.to_datetime` when called with ``None`` raising ``TypeError`` instead of returning ``NaT`` (:issue:`30011`)
- Bug in :func:`pandas.to_datetime` failing for `deques` when using ``cache=True`` (the default) (:issue:`29403`)
- Bug in :meth:`Series.item` with ``datetime64`` or ``timedelta64`` dtype, :meth:`DatetimeIndex.item`, and :meth:`TimedeltaIndex.item` returning an integer instead of a :class:`Timestamp` or :class:`Timedelta` (:issue:`30175`)
- Bug in :class:`DatetimeIndex` addition when adding a non-optimized :class:`DateOffset` incorrectly dropping timezone information (:issue:`30336`)
-

Timedelta
^^^^^^^^^
@@ -760,7 +758,7 @@ Interval
^^^^^^^^

- Bug in :meth:`IntervalIndex.get_indexer` where a :class:`Categorical` or :class:`CategoricalIndex` ``target`` would incorrectly raise a ``TypeError`` (:issue:`30063`)
- Bug in ``pandas.core.dtypes.cast.infer_dtype_from_scalar`` where passing ``pandas_dtype=True`` did not infer :class:`IntervalDtype` (:issue:`30337`)
-

Indexing
^^^^^^^^
4 changes: 2 additions & 2 deletions pandas/_config/config.py
@@ -57,10 +57,10 @@
DeprecatedOption = namedtuple("DeprecatedOption", "key msg rkey removal_ver")
RegisteredOption = namedtuple("RegisteredOption", "key defval doc validator cb")

# holds deprecated option metadata
# holds deprecated option metdata
_deprecated_options: Dict[str, DeprecatedOption] = {}

# holds registered option metadata
# holds registered option metdata
_registered_options: Dict[str, RegisteredOption] = {}

# holds the current values for registered options
2 changes: 1 addition & 1 deletion pandas/_libs/groupby.pyx
@@ -791,7 +791,7 @@ def group_quantile(ndarray[float64_t] out,
out[i] = NaN
else:
# Calculate where to retrieve the desired value
# Casting to int will intentionally truncate result
# Casting to int will intentionaly truncate result
idx = grp_start + <int64_t>(q * <float64_t>(non_na_sz - 1))

val = values[sort_arr[idx]]
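The truncation the comment describes, worked through with example values (q and group size chosen here for illustration):

```python
# Casting to int truncates toward zero, so the computed position is
# the lower of the two neighboring sorted positions for quantile q.
q = 0.5
grp_start = 0
non_na_sz = 5
idx = grp_start + int(q * (non_na_sz - 1))
print(idx)  # 2 (the middle of 5 sorted values)
```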
2 changes: 1 addition & 1 deletion pandas/_libs/index.pyx
@@ -288,7 +288,7 @@ cdef class IndexEngine:

def get_indexer_non_unique(self, targets):
"""
Return an indexer suitable for taking from a non unique index
Return an indexer suitable for takng from a non unique index
return the labels in the same order ast the target
and a missing indexer into the targets (which correspond
to the -1 indices in the results
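The behavior this docstring describes is visible through the public Index API (example data is illustrative): duplicate matches are expanded in order, and the second array holds positions of targets that were not found.

```python
import pandas as pd

idx = pd.Index(["a", "b", "b"])
# "b" matches positions 1 and 2; "c" is missing and maps to -1
indexer, missing = idx.get_indexer_non_unique(["b", "c"])
print(list(indexer), list(missing))
```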
2 changes: 1 addition & 1 deletion pandas/_libs/lib.pyx
@@ -510,7 +510,7 @@ def maybe_booleans_to_slice(ndarray[uint8_t] mask):
@cython.boundscheck(False)
def array_equivalent_object(left: object[:], right: object[:]) -> bool:
"""
Perform an element by element comparison on 1-d object arrays
Perform an element by element comparion on 1-d object arrays
taking into account nan positions.
"""
cdef:
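A hypothetical pure-Python sketch of the comparison the docstring describes (the real implementation is Cython and handles more cases):

```python
import numpy as np

def array_equivalent_object(left, right):
    # Elementwise comparison that treats NaN positions as equal.
    return all(
        x is y
        or x == y
        or (isinstance(x, float) and isinstance(y, float)
            and np.isnan(x) and np.isnan(y))
        for x, y in zip(left, right)
    )

print(array_equivalent_object([1, np.nan, "x"], [1, np.nan, "x"]))  # True
```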
21 changes: 1 addition & 20 deletions pandas/_libs/parsers.pyx
@@ -1367,26 +1367,7 @@ def _ensure_encoded(list lst):
# common NA values
# no longer excluding inf representations
# '1.#INF','-1.#INF', '1.#INF000000',
STR_NA_VALUES = {
"-1.#IND",
"1.#QNAN",
"1.#IND",
"-1.#QNAN",
"#N/A N/A",
"#N/A",
"N/A",
"n/a",
"NA",
"#NA",
"NULL",
"null",
"NaN",
"-NaN",
"nan",
"-nan",
"",
}
_NA_VALUES = _ensure_encoded(list(STR_NA_VALUES))
_NA_VALUES = _ensure_encoded(list(icom._NA_VALUES))
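A few of the default NA strings collected above are recognized by read_csv out of the box (example data made up for illustration):

```python
import io
import pandas as pd

# "NA" and "null" are in the default NA-value set; "1" parses as a number.
csv = io.StringIO("x\nNA\nnull\n1")
df = pd.read_csv(csv)
print(int(df["x"].isna().sum()))  # 2
```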


def _maybe_upcast(arr):
2 changes: 1 addition & 1 deletion pandas/_libs/src/klib/khash.h
@@ -498,7 +498,7 @@ PANDAS_INLINE khint_t __ac_Wang_hash(khint_t key)
*/
#define kh_n_buckets(h) ((h)->n_buckets)

/* More convenient interfaces */
/* More conenient interfaces */

/*! @function
@abstract Instantiate a hash set containing integer keys
6 changes: 3 additions & 3 deletions pandas/_libs/src/ujson/lib/ultrajsondec.c
@@ -150,7 +150,7 @@ FASTCALL_ATTR JSOBJ FASTCALL_MSVC decode_numeric(struct DecoderState *ds) {
case '7':
case '8':
case '9': {
// FIXME: Check for arithmetic overflow here
// FIXME: Check for arithemtic overflow here
// PERF: Don't do 64-bit arithmetic here unless we know we have
// to
intValue = intValue * 10ULL + (JSLONG)(chr - 48);
@@ -235,7 +235,7 @@ FASTCALL_ATTR JSOBJ FASTCALL_MSVC decode_numeric(struct DecoderState *ds) {
}

BREAK_FRC_LOOP:
// FIXME: Check for arithmetic overflow here
// FIXME: Check for arithemtic overflow here
ds->lastType = JT_DOUBLE;
ds->start = offset;
return ds->dec->newDouble(
@@ -282,7 +282,7 @@ FASTCALL_ATTR JSOBJ FASTCALL_MSVC decode_numeric(struct DecoderState *ds) {
}

BREAK_EXP_LOOP:
// FIXME: Check for arithmetic overflow here
// FIXME: Check for arithemtic overflow here
ds->lastType = JT_DOUBLE;
ds->start = offset;
return ds->dec->newDouble(
2 changes: 1 addition & 1 deletion pandas/_libs/src/ujson/python/objToJSON.c
@@ -1632,7 +1632,7 @@ char **NpyArr_encodeLabels(PyArrayObject *labels, PyObjectEncoder *enc,
sprintf(buf, "%" NPY_INT64_FMT, value);
len = strlen(cLabel);
}
} else { // Fallback to string representation
} else { // Fallack to string representation
PyObject *str = PyObject_Str(item);
if (str == NULL) {
Py_DECREF(item);
27 changes: 8 additions & 19 deletions pandas/_libs/tslibs/timestamps.pyx
@@ -324,7 +324,7 @@ class Timestamp(_Timestamp):

Function is not implemented. Use pd.to_datetime().
"""
raise NotImplementedError("Timestamp.strptime() is not implemented."
raise NotImplementedError("Timestamp.strptime() is not implmented."
"Use to_datetime() to parse date strings.")

@classmethod
@@ -336,22 +336,11 @@
"""
return cls(datetime.combine(date, time))
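Usage of combine as defined in this hunk (the example date and time are arbitrary):

```python
from datetime import date, time
import pandas as pd

# combine folds a date and a time into a single Timestamp
ts = pd.Timestamp.combine(date(2019, 12, 22), time(12, 30))
print(ts)  # 2019-12-22 12:30:00
```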

def __new__(
cls,
object ts_input=_no_input,
object freq=None,
tz=None,
unit=None,
year=None,
month=None,
day=None,
hour=None,
minute=None,
second=None,
microsecond=None,
nanosecond=None,
tzinfo=None
):
def __new__(cls, object ts_input=_no_input,
object freq=None, tz=None, unit=None,
year=None, month=None, day=None,
hour=None, minute=None, second=None, microsecond=None,
nanosecond=None, tzinfo=None):
# The parameter list folds together legacy parameter names (the first
# four) and positional and keyword parameter names from pydatetime.
#
@@ -412,8 +401,8 @@ class Timestamp(_Timestamp):
freq = None

if getattr(ts_input, 'tzinfo', None) is not None and tz is not None:
raise ValueError("Cannot pass a datetime or Timestamp with tzinfo with "
"the tz parameter. Use tz_convert instead.")
raise ValueError("Cannot pass a datetime or Timestamp with tzinfo with the"
" tz parameter. Use tz_convert instead.")

ts = convert_to_tsobject(ts_input, tz, unit, 0, 0, nanosecond or 0)

7 changes: 3 additions & 4 deletions pandas/core/arrays/datetimes.py
@@ -794,17 +794,16 @@ def _add_offset(self, offset):
values = self.tz_localize(None)
else:
values = self
result = offset.apply_index(values).tz_localize(self.tz)
result = offset.apply_index(values)
if self.tz is not None:
result = result.tz_localize(self.tz)

except NotImplementedError:
warnings.warn(
"Non-vectorized DateOffset being applied to Series or DatetimeIndex",
PerformanceWarning,
)
result = self.astype("O") + offset
if len(self) == 0:
# _from_sequence won't be able to infer self.tz
return type(self)._from_sequence(result).tz_localize(self.tz)

return type(self)._from_sequence(result, freq="infer")
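With the tz handling above, adding an offset to a tz-aware DatetimeIndex preserves the timezone; a small sketch with assumed example dates:

```python
import pandas as pd

idx = pd.date_range("2019-12-01", periods=3, tz="UTC")
# the offset is applied and the result keeps the original timezone
res = idx + pd.tseries.offsets.Day(1)
print(res[0])  # 2019-12-02 00:00:00+00:00
```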

2 changes: 1 addition & 1 deletion pandas/core/arrays/sparse/dtype.py
@@ -290,7 +290,7 @@ def update_dtype(self, dtype):
Returns
-------
SparseDtype
A new SparseDtype with the correct `dtype` and fill value
A new SparseDtype with the corret `dtype` and fill value
for that `dtype`.

Raises
4 changes: 2 additions & 2 deletions pandas/core/arrays/string_.py
@@ -86,7 +86,7 @@ def __from_arrow__(self, array):

results = []
for arr in chunks:
# using _from_sequence to ensure None is converted to NA
# using _from_sequence to ensure None is convered to NA
str_arr = StringArray._from_sequence(np.array(arr))
results.append(str_arr)

@@ -153,7 +153,7 @@ class StringArray(PandasArray):
...
ValueError: StringArray requires an object-dtype ndarray of strings.

For comparison methods, this returns a :class:`pandas.BooleanArray`
For comparision methods, this returns a :class:`pandas.BooleanArray`

>>> pd.array(["a", None, "c"], dtype="string") == "a"
<BooleanArray>
5 changes: 1 addition & 4 deletions pandas/core/dtypes/cast.py
@@ -41,7 +41,7 @@
is_unsigned_integer_dtype,
pandas_dtype,
)
from .dtypes import DatetimeTZDtype, ExtensionDtype, IntervalDtype, PeriodDtype
from .dtypes import DatetimeTZDtype, ExtensionDtype, PeriodDtype
from .generic import (
ABCDataFrame,
ABCDatetimeArray,
@@ -601,9 +601,6 @@ def infer_dtype_from_scalar(val, pandas_dtype: bool = False):
if lib.is_period(val):
dtype = PeriodDtype(freq=val.freq)
val = val.ordinal
elif lib.is_interval(val):
subtype = infer_dtype_from_scalar(val.left, pandas_dtype=True)[0]
dtype = IntervalDtype(subtype=subtype)

return dtype, val
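The Period branch shown above maps the scalar's freq onto a PeriodDtype; a sketch via the public API (the example period is arbitrary):

```python
import pandas as pd

# The scalar's frequency becomes the dtype's frequency.
p = pd.Period("2019-12", freq="M")
dtype = pd.PeriodDtype(freq=p.freq)
print(dtype)  # period[M]
```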
