Why Does super() Raise 'TypeError: must be type, not classobj' in Python New-Style Class Inheriting from HTMLParser?

If you’ve ever tried to create a Python class that inherits from HTMLParser (from the HTMLParser module in Python 2, or html.parser in Python 3) and used super() to initialize the parent class, you might have encountered a puzzling error: TypeError: must be type, not classobj. This error can be frustrating, especially if you’re familiar with super() and believe your class is already new-style.

In this blog post, we’ll demystify this error by breaking down the differences between old-style and new-style classes in Python, examining the nature of HTMLParser, and explaining why mixing these concepts leads to the error. We’ll also provide actionable solutions to fix it.

Table of Contents#

  1. Understanding Old-Style vs. New-Style Classes
  2. What is HTMLParser?
  3. The Root Cause: Mixing Class Styles
  4. How super() Works with Method Resolution Order (MRO)
  5. Solutions to Fix the Error
  6. Example Code Walkthrough
  7. Conclusion
  8. References

Understanding Old-Style vs. New-Style Classes#

To understand the error, we first need to distinguish between old-style and new-style classes in Python. This distinction is critical because it affects how classes inherit, initialize, and interact with built-in functions like super().

Old-Style Classes (Python 2 Only)#

In Python 2, classes that do not explicitly inherit from object are called "old-style classes." They are defined like this:

class OldStyleClass:  # No inheritance from object
    pass

Old-style classes have:

  • A type of classobj (type(OldStyleClass) returns <type 'classobj'>).
  • No working support for __slots__ or super(), and unreliable support for property (getters work, but setters are bypassed).
  • A simple depth-first, left-to-right method resolution order (MRO) for inheritance.

New-Style Classes (Python 2 and 3)#

New-style classes inherit directly or indirectly from object. In Python 3, all classes are new-style by default, even if you don’t explicitly inherit from object. In Python 2, you must explicitly inherit from object to create a new-style class:

class NewStyleClass(object):  # Explicitly inherit from object (Python 2)
    pass
 
# Python 3: Implicitly inherits from object
class NewStyleClass:
    pass

New-style classes have:

  • A type of type (type(NewStyleClass) returns <type 'type'> in Python 2 and <class 'type'> in Python 3).
  • Support for advanced features like super(), __mro__ (method resolution order), and descriptors.
  • A more predictable MRO based on the C3 linearization algorithm.
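
To see which kind of class you are dealing with, you can inspect it at runtime. The sketch below uses a hypothetical helper, describe_class, which is not part of the standard library. It is written to run on both Python 2 and 3; on Python 3 every class reports as new-style.

```python
def describe_class(cls):
    """Return 'new-style' or 'old-style' for a class object.

    Hypothetical helper for illustration only. New-style classes are
    instances of `type`; Python 2 old-style classes are instances of
    `classobj`, so the isinstance() check is False for them.
    """
    return "new-style" if isinstance(cls, type) else "old-style"


class Plain:             # old-style on Python 2, new-style on Python 3
    pass


class Explicit(object):  # new-style on both Python 2 and 3
    pass


print(describe_class(Explicit))      # new-style
print(hasattr(Explicit, "__mro__"))  # True
print(describe_class(Plain))         # depends on the interpreter version
```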

What is HTMLParser?#

HTMLParser is a class in the standard library’s html.parser module (or the HTMLParser module in Python 2) used to parse HTML and XHTML documents. You use it by subclassing it and overriding callback methods such as handle_starttag or handle_data, which the parser calls as it encounters tags, text, comments, and other HTML elements.
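
As a rough illustration of the callback model, here is a minimal sketch in Python 3 syntax. The LinkCollector class and its attributes are invented for this example; only HTMLParser, feed(), close(), handle_starttag, and handle_data come from the standard library.

```python
from html.parser import HTMLParser  # Python 3 import


class LinkCollector(HTMLParser):
    """Hypothetical example parser that records href values and text."""

    def __init__(self):
        super().__init__()  # Python 3 form; Python 2 needs the fixes in this post
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

    def handle_data(self, data):
        # Called for text between tags; skip whitespace-only chunks.
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


collector = LinkCollector()
collector.feed('<p>See <a href="https://example.com">the docs</a>.</p>')
collector.close()
print(collector.links)
print(collector.text)
```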

Key Note About HTMLParser’s Class Style:#

In Python 2, HTMLParser is an old-style class. It does not inherit from object, so its type is classobj (not type). This is critical because a subclass that inherits only from HTMLParser is itself old-style, even though it looks like an ordinary subclass, and old-style classes cannot be passed to super().

In Python 3, all classes (including HTMLParser) are new-style by default, so this issue does not arise. The error discussed here is almost exclusively a Python 2 problem.

The Root Cause: Mixing Class Styles#

The error TypeError: must be type, not classobj occurs when the class you pass to super() is itself an old-style class. With HTMLParser in Python 2, that happens when you:

  1. Define a subclass whose only base is an old-style class, such as Python 2’s HTMLParser.
  2. Assume the subclass is new-style, although nothing in its bases inherits from object.
  3. Call super(MyParser, self).__init__() to initialize the parent class.

Here’s why this fails:
super() is designed for new-style classes. Its first argument must be an instance of type, and it relies on the Method Resolution Order (MRO) stored in that class’s __mro__ attribute. A class that inherits only from old-style classes is old-style too: its type is classobj and it has no __mro__. super() therefore rejects the subclass as soon as it checks its first argument, before it looks up any parent method.
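
Python 3 has no old-style classes, so the exact error cannot be reproduced there, but the same validation of super()’s first argument can be observed by passing something that is not a class. Treat the sketch below as an analogy rather than the original failure; the Widget class is made up for the demonstration, and the wording of the message differs between Python versions.

```python
class Widget(object):
    pass


widget = Widget()

# A real class as the first argument is accepted.
proxy = super(Widget, widget)
print(type(proxy).__name__)  # super

# Anything that is not an instance of `type` is rejected up front.
# A Python 2 old-style class (a classobj) fails this same check.
try:
    super("Widget", widget)
except TypeError as exc:
    rejected = True
    print("TypeError:", exc)
else:
    rejected = False
```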

How super() Works with Method Resolution Order (MRO)#

super(type, obj) returns a proxy object that delegates attribute lookups to the classes that follow type in the MRO of obj’s class. For new-style classes, the MRO is a tuple of classes that Python searches in order to find the next implementation of a method (e.g., __init__).

For example, in a new-style class hierarchy:

class A(object): pass
class B(A): pass
class C(B): pass
 
print(C.__mro__)  # Output: (<class '__main__.C'>, <class '__main__.B'>, <class '__main__.A'>, <type 'object'>)

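Here, super(C, self).__init__() searches B, then A, then object, and calls the first __init__ it finds. Because B and A define none, that is object’s. The chain continues past the first match only if each __init__ calls super() in turn.

Old-style classes have no __mro__ attribute, and super() does not accept a classobj as its first argument. When your subclass inherits only from an old-style parent (like HTMLParser in Python 2), the subclass is a classobj as well, so super(MyParser, self) raises the classobj error.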
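
To make the delegation order concrete, here is a small sketch in which every __init__ records its class name and then hands off to the next class in the MRO. The calls list exists only for this demonstration.

```python
calls = []


class A(object):
    def __init__(self):
        calls.append("A")
        super(A, self).__init__()  # next after A in the MRO: object


class B(A):
    def __init__(self):
        calls.append("B")
        super(B, self).__init__()  # next after B in the MRO: A


class C(B):
    def __init__(self):
        calls.append("C")
        super(C, self).__init__()  # next after C in the MRO: B


C()
print([cls.__name__ for cls in C.__mro__])  # ['C', 'B', 'A', 'object']
print(calls)                                # ['C', 'B', 'A']
```

Each call to super() moves exactly one step along the MRO, so removing the super() call from B.__init__ would stop the chain before A.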

Solutions to Fix the Error#

To resolve TypeError: must be type, not classobj, you have three main options:

1. Avoid super() and Call the Parent Constructor Directly#

Instead of using super(), explicitly call the old-style parent class’s __init__ method. This bypasses super()’s new-style requirements:

from HTMLParser import HTMLParser  # Python 2 import
 
class MyParser(HTMLParser):  # Old-style subclass (Python 2)
    def __init__(self):
        HTMLParser.__init__(self)  # Direct call to old-style parent
        # ... rest of your code ...

2. Make Your Subclass New-Style by Adding object as a Base (Python 2)#

In Python 2, super() only requires that the class you pass to it is new-style. Python 2’s HTMLParser cannot be changed, but your subclass can list object as an additional base, as in class MyParser(HTMLParser, object). That makes MyParser a new-style class with a proper MRO, and super(MyParser, self).__init__() then finds HTMLParser.__init__ through it. Keep HTMLParser first in the base list so that its methods take precedence over object’s.
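
Listing object as an additional base is a common way to get a new-style subclass of an old-style parent in Python 2. A minimal sketch, written with an import fallback so that it also runs on Python 3, where the extra base is redundant but harmless. The ready attribute is made up for the example.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2


class MyParser(HTMLParser, object):
    # On Python 2 the extra `object` base makes MyParser new-style,
    # so it is a valid first argument to super().
    def __init__(self):
        super(MyParser, self).__init__()  # resolves to HTMLParser.__init__
        self.ready = True


parser = MyParser()
print(isinstance(MyParser, type))              # True
print([c.__name__ for c in MyParser.__mro__])
```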

3. Upgrade to Python 3#

In Python 3, all classes are new-style, and HTMLParser inherits from object. Thus, mixing styles is impossible, and super() works as expected:

from html.parser import HTMLParser
 
class MyParser(HTMLParser):  # New-style by default (Python 3)
    def __init__(self):
        super().__init__()  # No error!

Example Code Walkthrough#

Let’s reproduce the error and fix it step-by-step.

Step 1: Reproduce the Error (Python 2)#

from HTMLParser import HTMLParser  # Python 2 import
 
class MyParser(HTMLParser):  # Looks ordinary, but is old-style like its parent
    def __init__(self):
        super(MyParser, self).__init__()  # Error here!
 
parser = MyParser()

Output:

TypeError: must be type, not classobj

Why? HTMLParser is an old-style class (classobj) in Python 2, and MyParser inherits from nothing else, so MyParser is a classobj too. super() requires its first argument to be a new-style class (type), so it fails as soon as it receives MyParser.

Step 2: Fix with Direct Parent Call (Python 2)#

Replace super() with a direct call to HTMLParser.__init__:

from HTMLParser import HTMLParser
 
class MyParser(HTMLParser):  # Old-style subclass (no object inheritance)
    def __init__(self):
        HTMLParser.__init__(self)  # Directly call old-style parent
        print("Parser initialized!")
 
parser = MyParser()  # Output: Parser initialized! (No error)

Step 3: Python 3 Solution (No Error)#

In Python 3, HTMLParser is new-style, so super() works:

from html.parser import HTMLParser  # Python 3 import
 
class MyParser(HTMLParser):  # New-style by default
    def __init__(self):
        super().__init__()  # No error
        print("Parser initialized!")
 
parser = MyParser()  # Output: Parser initialized!
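
If the same module has to run on both interpreters, one option is to combine an import fallback with the direct parent call from Step 2, since HTMLParser.__init__(self) is valid for old-style and new-style classes alike. A minimal sketch; the TagCounter class and its counts attribute are made up for illustration.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2


class TagCounter(HTMLParser):
    """Hypothetical parser that counts how often each start tag occurs."""

    def __init__(self):
        HTMLParser.__init__(self)  # no super(), so no classobj error
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1


counter = TagCounter()
counter.feed("<ul><li>a</li><li>b</li></ul>")
counter.close()
print(counter.counts["ul"], counter.counts["li"])  # 1 2
```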

Conclusion#

The TypeError: must be type, not classobj when using super() with HTMLParser in Python 2 is caused by passing an old-style class to super(). Python 2’s HTMLParser is old-style, so a subclass that inherits only from it is old-style as well. super() relies on new-style class features (e.g., __mro__), which old-style classes lack.

To fix it:

  • In Python 2: Avoid super() and call HTMLParser.__init__(self) directly, or add object as a second base so that your subclass becomes new-style.
  • In Python 3: Use super() freely, as all classes are new-style.

Upgrading to Python 3 is the most sustainable solution, as it eliminates old-style classes entirely.

References#