Tutorial — openpyxl 3.1.3 documentation (2024)

Installation

Install openpyxl using pip. It is advisable to do this in a Python virtualenv without system packages:

$ pip install openpyxl

Note

There is support for the popular lxml library, which will be used if it is installed. This is particularly useful when creating large files.

Warning

To be able to include images (jpeg, png, bmp, …) in an openpyxl file, you will also need the “pillow” library, which can be installed with:

$ pip install pillow

or browse https://pypi.python.org/pypi/Pillow/, pick the latest version and head to the bottom of the page for Windows binaries.

Working with a checkout

Sometimes you might want to work with the checkout of a particular version. This may be the case if bugs have been fixed but a release has not yet been made.

$ pip install -e hg+https://foss.heptapod.net/openpyxl/openpyxl/@3.1#egg=openpyxl

Create a workbook

There is no need to create a file on the filesystem to get started with openpyxl. Just import the Workbook class and start work:

>>> from openpyxl import Workbook
>>> wb = Workbook()

A workbook is always created with at least one worksheet. You can get it by using the Workbook.active property:

>>> ws = wb.active

Note

This is set to 0 by default. Unless you modify its value, you will always get the first worksheet by using this method.

You can create new worksheets using the Workbook.create_sheet() method:

>>> ws1 = wb.create_sheet("Mysheet") # insert at the end (default)
# or
>>> ws2 = wb.create_sheet("Mysheet", 0) # insert at first position
# or
>>> ws3 = wb.create_sheet("Mysheet", -1) # insert at the penultimate position

Sheets are given a name automatically when they are created. They are numbered in sequence (Sheet, Sheet1, Sheet2, …). You can change this name at any time with the Worksheet.title property:

ws.title = "New Title"

Once you have given a worksheet a name, you can get it as a key of the workbook:

>>> ws3 = wb["New Title"]

You can review the names of all worksheets of the workbook with the Workbook.sheetnames attribute.
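As a minimal sketch of how the position argument of create_sheet() affects the order reported by wb.sheetnames, the following builds a fresh workbook in memory. The sheet titles "First" and "Penultimate" are illustrative names chosen for this example, not part of the tutorial.

```python
from openpyxl import Workbook

wb = Workbook()                      # starts with one sheet titled "Sheet"
wb.create_sheet("Mysheet")           # appended at the end (default)
wb.create_sheet("First", 0)          # inserted at the first position
wb.create_sheet("Penultimate", -1)   # inserted just before the last sheet

names = wb.sheetnames                # list of titles in workbook order
print(names)
```

Giving each sheet a distinct title, as here, makes the resulting order easier to read than reusing "Mysheet" three times.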

You can loop through worksheets:

>>> for sheet in wb:
...     print(sheet.title)

You can create copies of worksheets within a single workbook with the Workbook.copy_worksheet() method:

>>> source = wb.active
>>> target = wb.copy_worksheet(source)

Note

Only cells (including values, styles, hyperlinks and comments) and certain worksheet attributes (including dimensions, format and properties) are copied. All other workbook / worksheet attributes are not copied, e.g. images and charts.

You also cannot copy worksheets between workbooks. You cannot copy a worksheet if the workbook is open in read-only or write-only mode.
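The following sketch illustrates that the copy is an independent worksheet inside the same workbook: changing a cell in the copy does not touch the source. The cell contents are made-up values for illustration.

```python
from openpyxl import Workbook

wb = Workbook()
source = wb.active
source["A1"] = "original"

target = wb.copy_worksheet(source)   # the copy is added to the same workbook
target["A1"] = "changed in copy"     # editing the copy leaves the source alone

print(wb.sheetnames)
print(source["A1"].value, "|", target["A1"].value)
```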

Playing with data

Accessing one cell

Now that we know how to get a worksheet, we can start modifying cell contents. Cells can be accessed directly as keys of the worksheet:

>>> c = ws['A4']

This will return the cell at A4, or create one if it does not exist yet. Values can be directly assigned:

>>> ws['A4'] = 4

There is also the Worksheet.cell() method.

This provides access to cells using row and column notation:

>>> d = ws.cell(row=4, column=2, value=10)
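To make the relationship between the two access styles concrete, here is a small sketch showing that row/column notation and key notation reach the same cell. It assumes a fresh in-memory workbook.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

d = ws.cell(row=4, column=2, value=10)  # row 4, column 2 is cell B4
same = ws["B4"]                         # key access reaches the same cell

print(d.coordinate, same.value)
```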

Note

When a worksheet is created in memory, it contains no cells. They are created when first accessed.

Warning

Because of this feature, scrolling through cells instead of accessing them directly will create them all in memory, even if you don’t assign them a value.

Something like

>>> for x in range(1,101):
...     for y in range(1,101):
...         ws.cell(row=x, column=y)

will create 100x100 cells in memory, for nothing.
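One way to observe this side effect is to compare the sheet's reported dimensions before and after a cell is merely accessed. This is a sketch using the Worksheet.max_row and Worksheet.max_column properties on a fresh workbook.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

before = (ws.max_row, ws.max_column)   # dimensions of the untouched sheet
ws.cell(row=100, column=100)           # accessing the cell creates it, no value assigned
after = (ws.max_row, ws.max_column)

print(before, after)
```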

Accessing many cells

Ranges of cells can be accessed using slicing:

>>> cell_range = ws['A1':'C2']

Ranges of rows or columns can be obtained similarly:

>>> colC = ws['C']
>>> col_range = ws['C:D']
>>> row10 = ws[10]
>>> row_range = ws[5:10]
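The shape of what these expressions return can be easier to see with data in the sheet. The sketch below fills two rows with Worksheet.append() (sample numbers chosen for illustration) and then reads a rectangular range and a single column.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append([1, 2, 3])   # row 1
ws.append([4, 5, 6])   # row 2

cell_range = ws["A1":"C2"]   # tuple of rows, each row a tuple of cells
values = [[cell.value for cell in row] for row in cell_range]

col_c = ws["C"]              # a single column comes back as a flat tuple of cells
col_c_values = [cell.value for cell in col_c]

print(values)
print(col_c_values)
```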

You can also use the Worksheet.iter_rows() method:

>>> for row in ws.iter_rows(min_row=1, max_col=3, max_row=2):
...     for cell in row:
...         print(cell)
<Cell Sheet1.A1>
<Cell Sheet1.B1>
<Cell Sheet1.C1>
<Cell Sheet1.A2>
<Cell Sheet1.B2>
<Cell Sheet1.C2>

Likewise the Worksheet.iter_cols() method will return columns:

>>> for col in ws.iter_cols(min_row=1, max_col=3, max_row=2):
...     for cell in col:
...         print(cell)
<Cell Sheet1.A1>
<Cell Sheet1.A2>
<Cell Sheet1.B1>
<Cell Sheet1.B2>
<Cell Sheet1.C1>
<Cell Sheet1.C2>

Note

For performance reasons the Worksheet.iter_cols() method is not available in read-only mode.

If you need to iterate through all the rows or columns of a file, you can instead use the Worksheet.rows property:

>>> ws = wb.active
>>> ws['C9'] = 'hello world'
>>> tuple(ws.rows)
((<Cell Sheet.A1>, <Cell Sheet.B1>, <Cell Sheet.C1>),
(<Cell Sheet.A2>, <Cell Sheet.B2>, <Cell Sheet.C2>),
(<Cell Sheet.A3>, <Cell Sheet.B3>, <Cell Sheet.C3>),
(<Cell Sheet.A4>, <Cell Sheet.B4>, <Cell Sheet.C4>),
(<Cell Sheet.A5>, <Cell Sheet.B5>, <Cell Sheet.C5>),
(<Cell Sheet.A6>, <Cell Sheet.B6>, <Cell Sheet.C6>),
(<Cell Sheet.A7>, <Cell Sheet.B7>, <Cell Sheet.C7>),
(<Cell Sheet.A8>, <Cell Sheet.B8>, <Cell Sheet.C8>),
(<Cell Sheet.A9>, <Cell Sheet.B9>, <Cell Sheet.C9>))

or the Worksheet.columns property:

>>> tuple(ws.columns)
((<Cell Sheet.A1>,
<Cell Sheet.A2>,
<Cell Sheet.A3>,
<Cell Sheet.A4>,
<Cell Sheet.A5>,
<Cell Sheet.A6>,
...
<Cell Sheet.B7>,
<Cell Sheet.B8>,
<Cell Sheet.B9>),
(<Cell Sheet.C1>,
<Cell Sheet.C2>,
<Cell Sheet.C3>,
<Cell Sheet.C4>,
<Cell Sheet.C5>,
<Cell Sheet.C6>,
<Cell Sheet.C7>,
<Cell Sheet.C8>,
<Cell Sheet.C9>))

Note

For performance reasons the Worksheet.columns property is not available in read-only mode.

Values only

If you just want the values from a worksheet you can use the Worksheet.values property. This iterates over all the rows in a worksheet but returns just the cell values:

for row in ws.values:
    for value in row:
        print(value)

Both Worksheet.iter_rows() and Worksheet.iter_cols() can take the values_only parameter to return just the cell’s value:

>>> for row in ws.iter_rows(min_row=1, max_col=3, max_row=2, values_only=True):
...     print(row)
(None, None, None)
(None, None, None)
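With an empty sheet the output above is all None. As a sketch with some data in place (the sample rows are invented for illustration), both approaches yield plain tuples of values rather than Cell objects:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["name", "qty"])
ws.append(["apples", 3])

# values_only=True yields one tuple of values per row instead of Cell objects
rows = list(ws.iter_rows(min_row=1, max_row=2, max_col=2, values_only=True))

# Worksheet.values walks every row in the sheet in the same way
all_rows = list(ws.values)

print(rows)
```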

Data storage

Once we have a Cell, we can assign it a value:

>>> c.value = 'hello, world'
>>> print(c.value)
hello, world
>>> d.value = 3.14
>>> print(d.value)
3.14

Saving to a file

The simplest and safest way to save a workbook is by using the Workbook.save() method of the Workbook object:

>>> wb = Workbook()
>>> wb.save('balances.xlsx')

Warning

This operation will overwrite existing files without warning.

Note

The filename extension is not forced to be xlsx or xlsm, although you might have some trouble opening it directly with another application if you don’t use an official extension.

As OOXML files are basically ZIP files, you can also open it with your favourite ZIP archive manager.

If required, you can specify the attribute wb.template=True to save a workbook as a template:

>>> wb = load_workbook('document.xlsx')
>>> wb.template = True
>>> wb.save('document_template.xltx')

Saving as a stream

If you want to save the file to a stream, e.g. when using a web application such as Pyramid, Flask or Django, then you can simply provide a NamedTemporaryFile():

>>> from tempfile import NamedTemporaryFile
>>> from openpyxl import Workbook
>>> wb = Workbook()
>>> with NamedTemporaryFile() as tmp:
...     wb.save(tmp.name)
...     tmp.seek(0)
...     stream = tmp.read()
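As an alternative sketch (not the approach shown above), the workbook can also be written to an in-memory buffer, on the assumption that Workbook.save() is handed a file-like object such as io.BytesIO instead of a path. This avoids touching the filesystem; the cell content is a made-up value.

```python
from io import BytesIO
from openpyxl import Workbook, load_workbook

wb = Workbook()
wb.active["A1"] = "streamed"

buffer = BytesIO()
wb.save(buffer)              # save() receives a file-like object instead of a path
stream = buffer.getvalue()   # raw bytes of the .xlsx (ZIP) archive

# Round trip: the bytes can be loaded again from another in-memory buffer
reloaded = load_workbook(BytesIO(stream))
print(reloaded.active["A1"].value)
```

The resulting bytes can then be returned as the body of a web response by whichever framework is in use.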

Warning

Make sure the workbook's attributes and the file extension agree when saving a document as a template and vice versa; otherwise the spreadsheet application may be unable to open the document.

Note

The following will fail:

>>> wb = load_workbook('document.xlsx')
>>> # Need to save with the extension *.xlsx
>>> wb.save('new_document.xlsm')
>>> # MS Excel can't open the document
>>>
>>> # or
>>>
>>> # Need to specify attribute keep_vba=True
>>> wb = load_workbook('document.xlsm')
>>> wb.save('new_document.xlsm')
>>> # MS Excel will not open the document
>>>
>>> # or
>>>
>>> wb = load_workbook('document.xltm', keep_vba=True)
>>> # If we need a template document, then we must specify extension as *.xltm.
>>> wb.save('new_document.xlsm')
>>> # MS Excel will not open the document

Loading from a file

You can use openpyxl.load_workbook() to open an existing workbook:

>>> from openpyxl import load_workbook
>>> wb = load_workbook(filename = 'empty_book.xlsx')
>>> sheet_ranges = wb['range names']
>>> print(sheet_ranges['D18'].value)
3

Note

There are several flags that can be used in load_workbook.

  • data_only controls whether cells with formulae have either the formula (default) or the value stored the last time Excel read the sheet.
  • keep_vba controls whether any Visual Basic elements are preserved or not (default). If they are preserved they are still not editable.
  • read_only opens workbooks in a read-only mode. This uses much less memory and is faster but not all features are available (charts, images, etc.)
  • rich_text controls whether any rich-text formatting in cells is preserved. The default is False.
  • keep_links controls whether data cached from external workbooks is preserved.
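To illustrate the data_only and read_only flags, the sketch below saves a small workbook containing a formula to a temporary directory and loads it back three ways. The file name flags_demo.xlsx and the cell contents are invented for the example. Because the file is written by openpyxl and never opened in Excel, no computed value is stored for the formula cell.

```python
import os
import tempfile
from openpyxl import Workbook, load_workbook

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "flags_demo.xlsx")  # hypothetical file name

    wb = Workbook()
    ws = wb.active
    ws["A1"] = 2
    ws["A2"] = 3
    ws["A3"] = "=SUM(A1:A2)"
    wb.save(path)

    # Default: formula cells return the formula text
    formula = load_workbook(path)["Sheet"]["A3"].value

    # data_only=True: formula cells return the stored result, if any
    cached = load_workbook(path, data_only=True)["Sheet"]["A3"].value

    # read_only=True: lazy loading; close the workbook when finished
    ro = load_workbook(path, read_only=True)
    first_column = [row[0] for row in ro["Sheet"].iter_rows(values_only=True)]
    ro.close()

print(formula, cached, first_column)
```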

Warning

openpyxl does not currently read all possible items in an Excel file, so shapes will be lost from existing files if they are opened and saved with the same name.

Errors loading workbooks

Sometimes openpyxl will fail to open a workbook. This is usually because there is something wrong with the file. If this is the case then openpyxl will try to provide some more information. openpyxl follows the OOXML specification closely and will reject files that do not, because they are invalid. When this happens you can use the exception from openpyxl to inform the developers of whichever application or library produced the file. As the OOXML specification is publicly available it is important that developers follow it.

You can find the spec by searching for ECMA-376; most of the implementation specifics are in Part 4.

This ends the tutorial for now. You can proceed to the Simple usage section.

FAQs

Which is better, XLSXWriter or openpyxl?

Both openpyxl and xlsxwriter have established themselves as popular choices in the Python community, with openpyxl enjoying a slight edge in overall popularity. This is understandable, considering that openpyxl provides a more comprehensive range of features.

Is openpyxl compatible with Python 3?

Yes. Installing openpyxl in a Python 3 environment such as Python 3.12 should be simple enough. Run pip install openpyxl, making sure to use the pip executable that belongs to the Python installation you intend to use rather than one from another installed version.

How to use openpyxl in Python?

Reading Excel Files in Python with Openpyxl
  1. import openpyxl; wb = openpyxl.load_workbook('videogamesales.xlsx') ...
  2. ws = wb.active ...
  3. ws = wb['vgsales']. Let's now count the number of rows and columns in this worksheet:
  4. print('Total number of rows: '+str(ws.max_row)+'. ...
  5. Total number of rows: 16328.

What is the difference between pandas and openpyxl?

Pandas – Pandas is a Python library used for data manipulation and analysis. It provides data structures for efficiently storing and manipulating large datasets. Openpyxl – Openpyxl is a Python library used for reading and writing Excel files. It allows you to perform various operations on Excel files using Python.

Which Python library is best for Excel?

Python packages for Excel, such as OpenPyXL, XlsxWriter, and Pandas, are used to interact with Excel files. OpenPyXL enables users to read, create, and change Excel spreadsheets.

Can you use pandas and openpyxl together?

Yes, they work well together. openpyxl and pandas form a powerful duo for automating Excel work from Python.

Can you use openpyxl without Excel?

If you need to read and write Excel files from Python without running Excel (e.g. as part of a batch process) then most likely you will want to use openpyxl. To expose Python functions in Excel as worksheet functions, macros, ribbon tool bar and so on then PyXLL will enable you to do that.

Is openpyxl part of Anaconda?

To export data from a Python object into Excel or import the contents of an Excel spreadsheet to perform calculations or visualizations in Python, Anaconda includes the following libraries and modules: openpyxl – Read/write Excel 2007 xlsx/xlsm files. xlrd – Extract data from Excel spreadsheets – .xls and .

Can you use openpyxl for CSV?

OpenPyXL doesn't load CSV files directly. What we can do is use the Python csv module to pull in the csv file, iterate over the data and create a workbook out of it. We can do a reverse process to go from Excel into a CSV file. These are the packages we'll need for either direction we're using.
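A minimal sketch of both directions, assuming UTF-8 encoded CSV files and a single worksheet, might look like this. The function names csv_to_xlsx and xlsx_to_csv and the file names are hypothetical helpers written for this example, not part of openpyxl. Values read from CSV are strings, and this sketch does not convert them to numbers.

```python
import csv
import os
import tempfile
from openpyxl import Workbook, load_workbook


def csv_to_xlsx(csv_path, xlsx_path):
    """Copy every CSV row into the active sheet of a new workbook."""
    wb = Workbook()
    ws = wb.active
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            ws.append(row)  # csv yields strings; no type conversion here
    wb.save(xlsx_path)


def xlsx_to_csv(xlsx_path, csv_path):
    """Write the values of the active sheet out as CSV rows."""
    wb = load_workbook(xlsx_path)
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for row in wb.active.iter_rows(values_only=True):
            writer.writerow(row)


with tempfile.TemporaryDirectory() as tmpdir:
    src = os.path.join(tmpdir, "in.csv")
    book = os.path.join(tmpdir, "book.xlsx")
    dst = os.path.join(tmpdir, "out.csv")

    with open(src, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([["name", "qty"], ["apples", "3"]])

    csv_to_xlsx(src, book)
    xlsx_to_csv(book, dst)

    with open(dst, newline="", encoding="utf-8") as handle:
        round_trip = list(csv.reader(handle))

print(round_trip)
```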

Why is openpyxl used?

Openpyxl is a Python library for reading and writing Excel files. It allows you to create, modify, and manage Excel files in a simple and efficient way. Here's a simple example of how to use it: from openpyxl import Workbook wb = Workbook() ws = wb.

Is openpyxl or pandas faster?

We can observe roughly a 6x improvement when pandas reads with the calamine engine instead of the default openpyxl engine, and roughly 8x faster reading with Polars. Polars is also 65% faster than pandas when both use the calamine engine.

When to use openpyxl?

Using Openpyxl, you can write scripts that clean up your data automatically. Combine multiple files: If you're dealing with data that's spread across multiple Excel files, Openpyxl can help you combine them into a single file. Create custom functions: Sometimes the built-in Excel functions just don't cut it.

Is openpyxl built in Python?

openpyxl is a Python library to read/write Excel 2010 xlsx/xlsm/xltx/xltm files. It was born from the lack of an existing library to read and write the Office Open XML format natively from Python.

Which Excel plugin is best for Python?

PyXLL is an Excel Add-In that enables developers to extend Excel's capabilities with Python code. PyXLL makes Python a productive, flexible back-end for Excel worksheets, and lets you use the familiar Excel user interface to interact with other parts of your information infrastructure.

Which Excel extension is best?

The XLSX file extension is associated with files saved as Microsoft Excel (2007/2010), one of the most popular and powerful tools you can use to create and format spreadsheets, graphs and much more. The XLSX files are used in Microsoft Excel (2007/2010) for Workbooks, spreadsheet, and document files.

What is the best library to read Excel files?

IronXL is a powerful and comprehensive C# Excel library designed to be the ultimate solution for creating, reading, and manipulating Excel files. It prioritizes ease of use, cross-platform compatibility, and extensive features. IronXL is a .NET library for working with Excel files in C# and other .

Which is better, openpyxl or xlwings?

xlwings is a powerful library that allows you to interact with Excel files directly from Python. It provides a Pythonic way to automate Excel tasks, manipulate data, and run macros. For tasks that drive Excel itself, xlwings can automate the work more reliably and efficiently than openpyxl or pandas.

Article information

Author: Lidia Grady
